| | |
|---|---|
| Author | Dawid Siera, Anatol Kaczmarek |
| License | None |
| Frameworks | PyTorch, Streamlit |
| Section | Path |
|---|---|
| Model Architecture | qsr/model.py |
| Dataset Loading | qsr/dataset_loading.py |
| Training Process | qsr/trainer.py |
| Streamlit Pages | pages/ |
| GUI Start Point | main.py |
This repository provides a user-friendly interface and backend to perform video super resolution using temporal convolutional neural networks. By leveraging the temporal dimension, we can enhance consecutive frames in a video to produce higher-resolution output, bridging the gap between a low-res input and crisp HD output.
- Install Dependencies

  Note: FFmpeg needs to be installed on the system.

  ```sh
  pip install -r requirements.txt
  ```

- Start the Streamlit GUI along with the MLflow server

  ```sh
  sh run.sh
  ```
- Navigate to the Training Page

  Choose the `Training` page, load your movie, specify hyperparameters, and run the training! At this step you can adjust the learning rate, optimizer, and loss to experiment.

- Switch to Prediction

  After training, move to the `Predicting` page. Select your trained model, upload a video, and generate a high-resolution version.
- Data Preparation
  - Gather training videos in 360p resolution (or your chosen “low-res” setting).
  - Ensure you have matching HD or 4K versions for ground truth.
- Hyperparameters
  - Tweak `Frames back/forward`, `Batch size`, `Number of epochs`, etc.
  - Experiment with different `Optimizers` (`Adam`, `AdamW`, `SGD`) and `Loss` functions for best performance.
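The hyperparameters above can be pictured as a small config object that the trainer consumes. This is a hypothetical sketch — the real names and wiring live in `qsr/trainer.py` and may differ — showing in particular how a GUI optimizer choice could map onto `torch.optim` classes:

```python
from dataclasses import dataclass

import torch


@dataclass
class TrainConfig:
    # Hypothetical names mirroring the GUI fields; not the repo's actual API.
    frames_back: int = 1      # temporal context before the current frame
    frames_forward: int = 1   # temporal context after the current frame
    batch_size: int = 8
    num_epochs: int = 10
    learning_rate: float = 1e-3
    optimizer: str = "AdamW"  # one of "Adam", "AdamW", "SGD"


def make_optimizer(cfg: TrainConfig, params):
    # Map the GUI's optimizer choice onto the matching torch.optim class.
    cls = {"Adam": torch.optim.Adam,
           "AdamW": torch.optim.AdamW,
           "SGD": torch.optim.SGD}[cfg.optimizer]
    return cls(params, lr=cfg.learning_rate)
```

Keeping the mapping in one place makes it easy to surface new optimizers in the GUI without touching the training loop.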
- Start Training
  - Once you click Start Training in the GUI, the system will begin iterating through epochs.
  - Progress is logged in your console and within the Streamlit interface.
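Behind the Start Training button sits an ordinary epoch loop. A minimal sketch, assuming a standard supervised setup (the actual implementation in `qsr/trainer.py` also handles MLflow logging and GUI progress updates):

```python
import torch
from torch import nn


def train(model: nn.Module, loader, optimizer, loss_fn, num_epochs: int = 10):
    # Minimal epoch loop: each batch pairs a low-res input with its
    # high-res ground truth, and the running loss is printed per epoch.
    model.train()
    for epoch in range(num_epochs):
        running = 0.0
        for lowres, target in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(lowres), target)
            loss.backward()
            optimizer.step()
            running += loss.item()
        print(f"epoch {epoch + 1}/{num_epochs}  loss={running / len(loader):.4f}")
```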
- Monitor Results
  - Watch for training loss and visual improvements in intermediate outputs.
  - Adjust hyperparameters as necessary.
- Select a Trained Model
  - Pick your best performer from the dropdown (e.g., `new_TSRCNN_large`).
- Set Input/Output Resolutions
  - Example: Input = 360p, Output = 720p.
- Frames Back/Forward
  - A setting of `1` means we use one frame behind and one ahead to improve the current frame’s detail.
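One common way to feed this temporal window to a CNN is to concatenate the neighboring frames along the channel axis. A sketch of that idea — `stack_temporal_window` is a hypothetical helper, not necessarily how `qsr/dataset_loading.py` does it:

```python
import torch


def stack_temporal_window(frames: torch.Tensor, idx: int,
                          back: int = 1, forward: int = 1) -> torch.Tensor:
    # frames: (T, C, H, W) clip. Returns the frame at `idx` concatenated
    # with `back` previous and `forward` following frames along the channel
    # axis, clamping indices at the clip boundaries.
    t = frames.shape[0]
    window = [frames[min(max(idx + o, 0), t - 1)]
              for o in range(-back, forward + 1)]
    return torch.cat(window, dim=0)  # (C * (back + 1 + forward), H, W)
```

With `back=forward=1` and RGB input, the model therefore sees a 9-channel tensor per prediction.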
- Upload a Video
  - Drag & drop your `.mp4` or `.mov`, up to 200 MB (customizable in the `streamlit` config).
- Upscale Video
  - The app processes each frame (and its neighbors) and outputs a higher-resolution video.
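The per-frame processing can be sketched as a simple inference loop: for each frame, gather its temporal neighbors, stack them into one input, and run the model. This is an illustrative sketch under assumed tensor shapes, not the repo's exact prediction code:

```python
import torch
from torch import nn


@torch.no_grad()
def upscale_clip(model: nn.Module, frames: torch.Tensor,
                 back: int = 1, forward: int = 1) -> torch.Tensor:
    # frames: (T, C, H, W) low-res clip; returns (T, C, H*s, W*s) where s
    # is the model's scale factor. Neighbor indices clamp at clip edges.
    model.eval()
    t = frames.shape[0]
    out = []
    for i in range(t):
        window = [frames[min(max(i + o, 0), t - 1)]
                  for o in range(-back, forward + 1)]
        x = torch.cat(window, dim=0).unsqueeze(0)  # (1, C*(back+1+forward), H, W)
        out.append(model(x).squeeze(0))
    return torch.stack(out)
```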
- Model: A temporal CNN that leverages neighboring frames to infer missing detail.
- Feature Extraction: Uses convolutional layers to detect spatial features, combined with short-term memory across frames.
- Loss Functions: PSNR and DSSIM (configurable) to reduce pixel-level errors.
- Framework: PyTorch for the backend, Streamlit for an easy UI.
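For reference, PSNR is a straightforward function of the mean squared error; since higher PSNR is better, a PSNR-based training objective minimizes its negation (DSSIM, typically defined as `(1 - SSIM) / 2`, needs the full windowed SSIM computation and is omitted here). A minimal sketch, assuming inputs scaled to `[0, 1]`:

```python
import torch


def psnr(pred: torch.Tensor, target: torch.Tensor,
         max_val: float = 1.0) -> torch.Tensor:
    # Peak signal-to-noise ratio in dB: 10 * log10(max_val^2 / MSE).
    # Undefined (infinite) when pred == target exactly.
    mse = torch.mean((pred - target) ** 2)
    return 10.0 * torch.log10(max_val ** 2 / mse)
```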

