46 changes: 46 additions & 0 deletions .github/workflow/build.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,46 @@
name: Build

on: push

jobs:
  unittest:
    name: Build wheel
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.10"]

    steps:
      # Checks out a copy of your repository on the ubuntu-latest machine
      - name: Checkout code
        uses: actions/checkout@v3

      # Select correct version of Python
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v3
        with:
          python-version: ${{ matrix.python-version }}

      # Install invoke
      - name: Install setuptools, invoke and virtualenv
        run: |
          python -m pip install --upgrade pip setuptools virtualenv wheel build
          python -m pip install invoke

      # Install the python package
      - name: Install python package
        run: |
          invoke install

      # Build the Python package
      - name: Build python package
        run: |
          invoke build

      # Archive production artifacts
      - name: Archive built Python artifacts
        uses: actions/upload-artifact@v3
        with:
          name: dist
          path: |
            dist
43 changes: 43 additions & 0 deletions .github/workflow/ci_cd.yml
@@ -0,0 +1,43 @@
name: Unittests and lint

on: push

jobs:
  unittest:
    name: Run the unit test and linter
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.10"]

    steps:
      # Checks out a copy of your repository on the ubuntu-latest machine
      - name: Checkout code
        uses: actions/checkout@v3

      # Select correct version of Python
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v3
        with:
          python-version: ${{ matrix.python-version }}

      # Install invoke
      - name: Install setuptools, invoke and virtualenv
        run: |
          python -m pip install --upgrade pip setuptools virtualenv wheel
          python -m pip install invoke

      # Install the python package
      - name: Install python package
        run: |
          invoke install --extra test

      # Run the unit tests
      - name: Run unit tests
        run: |
          invoke test --coverage

      # Run the linter
      - name: Run linter
        run: |
          invoke lint
49 changes: 49 additions & 0 deletions .github/workflow/publish.yml
@@ -0,0 +1,49 @@
name: Publish documentation

on:
  push:
    branches:
      - documentation
      - main

permissions:
  contents: write
  id-token: write
  pages: write

# Allow only one concurrent deployment, skipping runs queued between the run in-progress and latest queued.
# However, do NOT cancel in-progress runs as we want to allow these production deployments to complete.
concurrency:
  group: "pages"
  cancel-in-progress: false

jobs:
  build:
    name: Build and publish documentation
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.10"]

    steps:
      - name: Checkout code
        uses: actions/checkout@v3

      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v3
        with:
          python-version: ${{ matrix.python-version }}

      - name: Install setuptools and invoke
        run: |
          python -m pip install --upgrade pip setuptools wheel
          python -m pip install invoke

      - name: Build documentation
        run: |
          invoke build --docs

      - name: Publish documentation
        uses: JamesIves/github-pages-deploy-action@v4
        with:
          folder: docs
8 changes: 8 additions & 0 deletions _config.yml
@@ -0,0 +1,8 @@
title: YouTrend
description: YouTube Trending Video
theme: jekyll-theme-minimal

# Color Customization
minima:
  color: "#FF5733" # Main color (replace with your desired red color code)
  accent_color: "#FF5733" # Acc
25 changes: 25 additions & 0 deletions docs/app.md
@@ -0,0 +1,25 @@
# Web Application with Dash

## Overview

This web application, built with Dash, serves two audiences: users who want statistics about trending YouTube videos, and content creators. The app is structured as follows:

### 1. Main Page

On the "Main" page, users can explore the current ranking of the most popular videos, based on the most recently collected data. This section provides a real-time snapshot of the trending videos on YouTube.

### 2. Descriptive Statistics Page

The "Descriptive Statistics" page offers additional insights into the characteristics of trending videos. Users can access statistics related to themes, creators, and other relevant information. This section provides a more in-depth analysis to understand trends and patterns.

### 3. Duration Model Page

On the "Duration Model" page, users can apply a predictive model to estimate how likely a YouTube video is to reach the trending section. The model uses features such as video length, the creator's subscriber count, and publication date to calculate the video's survival probability, that is, the probability that it has not reached the trending section.

### 4. Diffusion Model Page

The "Diffusion Model" page is designed for video creators. It enables them to generate thumbnails for their videos based on a text input, such as the video title. To access this feature, users need to provide a token generated on the Hugging Face website. The application provides a convenient way for creators to enhance their video presentation.
151 changes: 151 additions & 0 deletions docs/duration_model.md
@@ -0,0 +1,151 @@
## Duration Model: Navigating YouTube Trending Probabilities

The Duration Model is organized into three main sections: **Process Data**, **Make Prediction**, and **Utility**. The module estimates the likelihood that a video enters the trending list and, conversely, the probability that it does not gain traction. This page documents the **Make Prediction** and **Utility** sections.



### Make Prediction

The **Make Prediction** section is the core of the Duration Model. It provides functions that estimate the probability that a YouTube video does not reach the trending section (its survival probability), based on features such as video length, the creator's subscriber count, and publication date.

#### Functions:

##### `survival_probability(video_link, date, api_key, region_code, video_cat_enc)`

Calculate the survival probability for a video.

- **Parameters:**
  - `video_link`: The link to the video.
  - `date`: Date for calculating the survival probability.
  - `api_key`: YouTube Data API key.
  - `region_code`: Region code for fetching category labels.
  - `video_cat_enc`: One-hot encoder for video categories.

- **Returns:** Survival probability as a float.

##### `plot_survival_probability(single_df, start_date, duration_days, gap, video_link, api_key, region_code, video_cat_enc)`

Plot the survival probability over a specified duration for a given video.

- **Parameters:**
  - `single_df`: DataFrame containing details of a single video.
  - `start_date`: Starting date for the survival probability calculation.
  - `duration_days`: Duration for which survival probability is calculated.
  - `gap`: Time gap between survival probability points.
  - `video_link`: The link to the video.
  - `api_key`: YouTube Data API key.
  - `region_code`: Region code for fetching category labels.
  - `video_cat_enc`: One-hot encoder for video categories.

- **Returns:** X and Y coordinates for plotting the survival probability.
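
As a rough illustration of what the returned coordinates might look like, the sketch below builds one point every `gap` days over `duration_days`, starting at `start_date`. The helper `survival_curve` and the stand-in `toy_probability` are hypothetical names introduced here, not part of YouTrend. The real function also takes the video link and API key and queries the YouTube Data API, which this sketch leaves out. Treating `gap` as a number of days is also an assumption.

```python
from datetime import date, timedelta
from typing import Callable, List, Tuple


def survival_curve(
    start_date: str,
    duration_days: int,
    gap: int,
    probability_at: Callable[[date], float],
) -> Tuple[List[date], List[float]]:
    """Evaluate a probability function on a regular grid of dates.

    Hypothetical helper: it only mirrors the start_date / duration_days / gap
    parameters documented above and is not YouTrend's implementation.
    """
    if gap <= 0:
        raise ValueError("gap must be a positive number of days")
    start = date.fromisoformat(start_date)
    # One point every `gap` days, from day 0 up to and including duration_days.
    x = [start + timedelta(days=offset) for offset in range(0, duration_days + 1, gap)]
    y = [probability_at(day) for day in x]
    return x, y


def toy_probability(day: date) -> float:
    """Stand-in for the model (assumption): decays by 5% per day after 2023-01-01."""
    elapsed = (day - date(2023, 1, 1)).days
    return 0.95 ** elapsed


x, y = survival_curve("2023-01-01", duration_days=30, gap=10, probability_at=toy_probability)
```

Passing the probability function in as an argument keeps the date grid separate from the model, so the grid logic can be checked without any API access.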

#### How to Use the Prediction Module:

1. **Survival Probability Calculation:**
   - Use the `survival_probability` function to calculate the survival probability for a specific video based on its features.

```python
# VIDEO_CAT_ENCODER is assumed to be a fitted one-hot encoder for video
# categories, such as the encoder returned by `preprocessing`.
prob = survival_probability(
    video_link="your_video_link",
    date="2023-01-01",
    api_key="your_api_key",
    region_code="US",
    video_cat_enc=VIDEO_CAT_ENCODER,
)
```

2. **Plotting Survival Probability:**
   - Use the `plot_survival_probability` function to compute the survival probability of a given video over a specified duration, ready for plotting.

```python
# Returns the X and Y coordinates of the survival curve.
x, y = plot_survival_probability(
    single_df=your_single_video_data_frame,
    start_date="2023-01-01",
    duration_days=30,
    gap=1,
    video_link="your_video_link",
    api_key="your_api_key",
    region_code="US",
    video_cat_enc=VIDEO_CAT_ENCODER,
)
```

Use these functions from YouTrend's Duration Model to estimate YouTube trending probabilities, inform your content strategy, and improve your chances of reaching the trending list.

## Utility Functions

The **Utility Module** is the backbone of the Duration Model. It provides the functions that extract, process, and analyze YouTube video data, and that feed the predictions of a video's survival probability.

### Main Functions:

#### `get_video_details(video_link, api_key, region_code, video_cat_enc) -> pd.DataFrame`

Fetches comprehensive details of a YouTube video using its link.

- **Parameters:**
  - `video_link`: The link to the YouTube video.
  - `api_key`: Your YouTube Data API key.
  - `region_code`: The region code for fetching category labels. Default is "US".
  - `video_cat_enc`: Optional OneHotEncoder for video categories.

- **Returns:** DataFrame containing rich information about the YouTube video.

#### `get_video_id(video_link) -> str`

Extracts the video ID from a YouTube video link.

- **Parameters:**
  - `video_link`: The YouTube video link.

- **Returns:** The extracted video ID as a string.
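
To make the extraction step concrete, here is a minimal sketch that handles a few common link shapes (`watch?v=`, `youtu.be/`, `/embed/`, `/shorts/`). The function name `extract_video_id`, the set of accepted hosts, and the choice to return `None` for unrecognized links are assumptions for illustration. The actual `get_video_id` may support other formats or handle errors differently.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Hosts this sketch recognizes (assumption, not an exhaustive list).
_YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def extract_video_id(video_link: str) -> Optional[str]:
    """Return the video ID found in a YouTube link, or None if none is found.

    Hypothetical helper, not YouTrend's get_video_id.
    """
    parsed = urlparse(video_link)
    host = (parsed.hostname or "").lower()

    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        candidate = parsed.path.lstrip("/").split("/")[0]
        return candidate or None

    if host in _YOUTUBE_HOSTS:
        if parsed.path == "/watch":
            # Standard links carry the ID in the "v" query parameter.
            values = parse_qs(parsed.query).get("v")
            return values[0] if values else None
        for prefix in ("/embed/", "/shorts/"):
            if parsed.path.startswith(prefix):
                candidate = parsed.path[len(prefix):].split("/")[0]
                return candidate or None

    return None


video_id = extract_video_id("https://www.youtube.com/watch?v=dQw4w9WgXcQ")
```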

#### `preprocessing(filename, dataframe, on_loading, video_cat_enc) -> Tuple[pd.DataFrame, Optional[List[str]], Optional[OneHotEncoder]]`

Processes the input data for the machine learning model.

- **Parameters:**
  - `filename`: Path to the CSV file containing the data.
  - `dataframe`: DataFrame containing the data.
  - `on_loading`: A boolean indicating whether the preprocessing is during loading.
  - `video_cat_enc`: OneHotEncoder for later preprocessing before predictions.

- **Returns:**
  - `df`: Processed DataFrame.
  - `model_features`: List of features for the duration model.
  - `encoder`: OneHotEncoder for later preprocessing before predictions.

#### `get_category_labels(api_key, region_code, youtube) -> Dict[str, str]`

Retrieves YouTube video category labels.

- **Parameters:**
  - `api_key`: Your YouTube Data API key.
  - `region_code`: The region code for fetching category labels. Default is 'US'.
  - `youtube`: Optional. The YouTube API service object.

- **Returns:** Dictionary mapping category IDs to category labels.
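
The returned mapping can be pictured with the sketch below, which turns a response in the shape of the YouTube Data API `videoCategories.list` endpoint (a list of `items`, each with an `id` and a `snippet.title`) into a dictionary. The helper name `category_labels_from_response` and the hand-written, trimmed `sample_response` are assumptions for illustration. No API call is made, and the real function may build the mapping differently.

```python
from typing import Any, Dict


def category_labels_from_response(response: Dict[str, Any]) -> Dict[str, str]:
    """Map category IDs to their titles.

    Hypothetical helper: assumes each item has "id" and "snippet" -> "title".
    """
    return {
        item["id"]: item["snippet"]["title"]
        for item in response.get("items", [])
    }


# Hand-written sample in the assumed response shape (not real API output).
sample_response = {
    "items": [
        {"id": "10", "snippet": {"title": "Music"}},
        {"id": "20", "snippet": {"title": "Gaming"}},
    ]
}

labels = category_labels_from_response(sample_response)
```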

#### `convert_duration_to_seconds(duration) -> int`

Converts YouTube video duration from ISO 8601 format to seconds.

- **Parameters:**
  - `duration`: Duration string in ISO 8601 format.

- **Returns:** Duration in seconds.
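
For reference, the conversion can be sketched with a regular expression. YouTube reports durations such as `PT4M13S` (4 minutes 13 seconds). The function name `iso8601_duration_to_seconds` is hypothetical, and the sketch only covers day, hour, minute, and second components with integer values. It is not necessarily how `convert_duration_to_seconds` is implemented.

```python
import re

# Matches durations such as "PT4M13S", "PT1H2M3S" or "P1DT2H".
# Weeks, months, years, and fractional values are deliberately not supported.
_DURATION_RE = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def iso8601_duration_to_seconds(duration: str) -> int:
    """Convert a (restricted) ISO 8601 duration string to whole seconds."""
    match = _DURATION_RE.match(duration)
    if match is None:
        raise ValueError(f"Unsupported ISO 8601 duration: {duration!r}")
    # Components missing from the string count as zero.
    parts = {name: int(value) if value else 0 for name, value in match.groupdict().items()}
    return (
        parts["days"] * 86400
        + parts["hours"] * 3600
        + parts["minutes"] * 60
        + parts["seconds"]
    )


seconds = iso8601_duration_to_seconds("PT4M13S")  # 4 * 60 + 13 = 253
```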

#### `get_channel_subscriber_count(api, channel_ids) -> Optional[int]`

Retrieves subscriber count for YouTube channels.

- **Parameters:**
  - `api`: The YouTube API service object.
  - `channel_ids`: List of YouTube channel IDs.

- **Returns:** The subscriber count or None if an error occurs.
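
The "count or None" behavior can be illustrated as follows, assuming a response shaped like the YouTube Data API `channels.list` endpoint with `part=statistics`, where `subscriberCount` arrives as a string and `hiddenSubscriberCount` marks channels that hide it. The helper name `subscriber_count_from_response`, the choice to read only the first item, and the sample data are assumptions. The real function performs the API request itself and may differ in how it treats several channel IDs.

```python
from typing import Any, Dict, Optional


def subscriber_count_from_response(response: Dict[str, Any]) -> Optional[int]:
    """Read the subscriber count of the first channel in a response.

    Hypothetical helper: returns None when the count is missing, hidden,
    or not a valid integer.
    """
    items = response.get("items") or []
    if not items:
        return None
    statistics = items[0].get("statistics", {})
    if statistics.get("hiddenSubscriberCount"):
        return None
    try:
        # The count is assumed to be serialized as a string, e.g. "12345".
        return int(statistics.get("subscriberCount"))
    except (TypeError, ValueError):
        return None


# Hand-written sample in the assumed response shape (not real API output).
sample_response = {
    "items": [
        {"statistics": {"subscriberCount": "12345", "hiddenSubscriberCount": False}}
    ]
}

count = subscriber_count_from_response(sample_response)
```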

### Global Constants:

- `API_KEY (str)`: YouTube Data API key.

That covers the Utility Module: the functions that work behind the scenes to fetch and prepare the data on which the Duration Model's predictions depend.
25 changes: 25 additions & 0 deletions docs/index.md
@@ -0,0 +1,25 @@
# YouTrend: Unleash the Power of YouTube Trending Analysis

Welcome to YouTrend, your ultimate companion in deciphering the intricacies of YouTube's trending video landscape! Our comprehensive package offers a suite of robust features designed to elevate your understanding and interaction with trending content on the platform.

## Features at a Glance

### 1. [Scraper: Unveiling the Data Goldmine](./scraper.md)
YouTrend's Scraper module is your key to unlocking the vast treasure trove of data from the YouTube Data API. Seamlessly collect detailed information about trending videos, paving the way for a deeper understanding of the trends that captivate audiences.

### 2. [Exploratory Data Analysis: Decoding Trends](./app.md)
Dive into the heart of trending videos with our Exploratory Data Analysis feature. Uncover hidden patterns, identify emerging themes, and visualize key insights through interactive graphics. Gain a profound understanding of what makes a video trend-worthy and stay ahead of the curve.

### 3. [Text to Image: Visualizing Potential](./stable_diffusion.md)
Revolutionize the way you approach video promotion with our Text to Image feature. By analyzing video descriptions and titles, YouTrend suggests compelling images tailored to enhance the visual appeal of your content. Elevate your video thumbnails, increasing the likelihood of catching the eye of potential viewers.

### 4. [Duration Model: Predicting Success](./duration_model.md)
Navigate the complex landscape of trending probabilities with our Duration Model. This sophisticated analysis module models the likelihood of a video entering the trending list and, conversely, the probability of it not gaining traction. Leverage this predictive power to fine-tune your content strategy for maximum impact.

### 5. [App: Your YouTrend Dashboard](./app.md)
Explore the YouTrend application, your interactive dashboard for all things trending on YouTube. Engage with the features seamlessly, making data-driven decisions and optimizing your content strategy for success.

## Elevate Your YouTube Experience
YouTrend isn't just a tool; it's your guide to mastering the art and science of YouTube trending analysis. Whether you're a content creator, marketer, or avid viewer, YouTrend empowers you to stay ahead of trends, create captivating content, and make informed decisions in the dynamic world of YouTube.

Unleash the potential of your content with YouTrend – where analytics meets creativity! Join us on this exciting journey into the heart of YouTube trends.