🥊 RWF2000 Violence Detection with TSM + MobileNet

A deep learning pipeline for binary violence detection in videos using the RWF-2000 dataset and a Temporal Shift Module (TSM) applied on top of a lightweight MobileNet backbone.

📌 Overview

This project tackles the task of automatically detecting violent behavior in surveillance-style video clips. Each video is classified into one of two categories:

Label	Class
`0`	NonFight
`1`	Fight

The model combines the efficiency of MobileNet with the temporal modeling power of TSM, enabling it to reason across video frames without the heavy computation of 3D convolutions.

📂 Project Structure

RWF2000_TSM/
├── config.py        # All hyperparameters and paths (CFG class)
├── train.py         # Training and evaluation loop
├── data/            # Dataset loading (RWF2000DatasetJPEG)
├── models/          # TSMMobileNet model definition
├── util/            # Frame extraction utilities
└── README.md

🧠 Model Architecture

TSMMobileNet — Temporal Shift Module wrapped around MobileNet:

Backbone: MobileNet (pretrained on ImageNet)
Temporal Module: TSM shifts a portion of channels along the time dimension, allowing the 2D backbone to implicitly capture motion across frames — zero extra parameters
Input format: (B, T×3, H, W) — segments are channel-stacked
Output: 2-class logits (NonFight / Fight, matching labels 0 / 1); CrossEntropyLoss applies the softmax internally
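To make the shift operation and the channel-stacked input format concrete, here is a minimal NumPy sketch. It is not the repository's TSMMobileNet code: the function names `temporal_shift` and `unstack_segments`, the `fold_div=8` ratio (one eighth of the channels shifted in each direction), and the frame-major ordering of the stacked channels are all assumptions made for illustration.

```python
import numpy as np

def unstack_segments(x, num_segments):
    """Turn a channel-stacked batch (B, T*3, H, W) into (B, T, 3, H, W).

    Assumes channels are ordered frame by frame: [frame0 RGB, frame1 RGB, ...].
    """
    b, tc, h, w = x.shape
    return x.reshape(b, num_segments, tc // num_segments, h, w)

def temporal_shift(x, fold_div=8):
    """Shift a portion of the channels along time. x has shape (B, T, C, H, W)."""
    c = x.shape[2]
    fold = c // fold_div
    out = np.zeros_like(x)
    # First `fold` channels: each frame receives features from the next frame.
    out[:, :-1, :fold] = x[:, 1:, :fold]
    # Next `fold` channels: each frame receives features from the previous frame.
    out[:, 1:, fold:2 * fold] = x[:, :-1, fold:2 * fold]
    # Remaining channels are left where they are.
    out[:, :, 2 * fold:] = x[:, :, 2 * fold:]
    return out

rng = np.random.default_rng(0)
clip = rng.random((2, 8 * 3, 224, 224), dtype=np.float32)  # (B, T*3, H, W)
frames = unstack_segments(clip, num_segments=8)            # (B, T, 3, H, W)

features = rng.random((2, 8, 16, 7, 7), dtype=np.float32)  # hypothetical feature map
shifted = temporal_shift(features)
print(frames.shape, shifted.shape)
```

The shift only moves existing values between neighbouring frames, which is why it adds no learnable parameters. Positions with no neighbour (the last frame for the forward shift, the first frame for the backward shift) are zero-filled in this sketch.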

⚙️ Configuration

All settings live in config.py under the CFG class:

Parameter	Value	Description
`NUM_SEGMENTS`	`8`	Frames sampled per video (T)
`IMG_SIZE`	`224`	Input spatial resolution
`BATCH_SIZE`	`8`	Training batch size
`EPOCHS`	`30`	Number of training epochs
`LR`	`1e-3`	Initial learning rate (AdamW)
`LR_STEPS`	`[5, 15]`	Epoch milestones for LR decay
`LR_GAMMA`	`0.1`	LR decay factor
`WEIGHT_DECAY`	`1e-3`	AdamW weight decay
`NUM_WORKERS`	`2`	DataLoader worker processes
`SEED`	`42`	Reproducibility seed
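As a worked example of what `LR`, `LR_STEPS` and `LR_GAMMA` imply, the sketch below computes the learning rate in effect at a given epoch. It assumes the milestones are applied as step decay (the behaviour of PyTorch's `MultiStepLR`, where the rate is multiplied by gamma once each milestone epoch is reached) and that epochs are 0-indexed; the helper name `lr_at_epoch` is hypothetical and not part of the repository.

```python
def lr_at_epoch(epoch, base_lr=1e-3, milestones=(5, 15), gamma=0.1):
    """Learning rate in effect during `epoch` (0-indexed) under step decay."""
    # Count how many milestones have already been reached.
    passed = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** passed

for epoch in (0, 4, 5, 14, 15, 29):
    print(f"epoch {epoch:2d}: lr = {lr_at_epoch(epoch):.0e}")
```

Under these assumptions the rate is 1e-3 for epochs 0 to 4, 1e-4 for epochs 5 to 14, and 1e-5 from epoch 15 onward.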

📦 Dataset

RWF-2000 — A large-scale video dataset for violence detection:

2,000 video clips collected from surveillance cameras
50/50 split between fight and non-fight clips
Train/validation split provided by the dataset

The pipeline pre-extracts frames to JPEG before training for faster I/O:

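extract_frames_to_jpeg(
    data_root=CFG.DATA_ROOT,         # raw RWF-2000 videos
    out_root=FRAME_ROOT,             # extracted JPEGs (/kaggle/working/rwf2000_frames/ on Kaggle)
    num_segments=CFG.NUM_SEGMENTS,   # frames kept per video (T)
    img_size=CFG.IMG_SIZE + 32,      # 256px → RandomCrop(224)
    quality=95,                      # JPEG quality
)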
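The README does not say how `extract_frames_to_jpeg` chooses which frames to keep. One common strategy for segment-based models is to split the clip into `num_segments` equal parts and take the middle frame of each; the sketch below illustrates that idea only. The helper name `segment_indices` and the sampling rule are assumptions, not the repository's implementation.

```python
def segment_indices(num_frames, num_segments=8):
    """Pick the middle frame of each of `num_segments` equal-length segments."""
    if num_frames <= 0:
        raise ValueError("video has no frames")
    seg_len = num_frames / num_segments
    # Clamp to the last valid index so very short clips still yield T frames.
    return [min(num_frames - 1, int(seg_len * (i + 0.5)))
            for i in range(num_segments)]

print(segment_indices(150))  # → [9, 28, 46, 65, 84, 103, 121, 140]
```

For clips shorter than `num_segments` frames, this rule repeats frames rather than failing, so every video still yields exactly T images.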

Note: The dataset path is configured for Kaggle: /kaggle/input/.../RWF-2000

🚀 Training

python train.py

The training loop uses:

Loss: CrossEntropyLoss
Optimizer: AdamW (lr=1e-3, weight_decay=1e-3)
Epochs: 40 as run in train.py (note that CFG.EPOCHS in config.py is listed as 30)
Hardware: CUDA GPU (falls back to CPU automatically)

Training progress is printed every 100 batches. After each epoch, validation accuracy is computed over the full val set.
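The per-epoch validation accuracy described above amounts to comparing the highest-scoring class with the label for every clip. A minimal pure-Python sketch follows; the helper name `accuracy` and the sample logits are made up for illustration, and in PyTorch the same prediction step would typically be `logits.argmax(dim=1)`.

```python
def accuracy(logits, labels):
    """Fraction of clips whose highest-scoring class matches the label."""
    correct = 0
    for scores, label in zip(logits, labels):
        # Index of the largest logit: 0 = NonFight, 1 = Fight.
        predicted = max(range(len(scores)), key=scores.__getitem__)
        correct += int(predicted == label)
    return correct / len(labels)

# Hypothetical logits for four validation clips.
val_logits = [[2.1, -0.3], [0.2, 1.5], [-1.0, 0.4], [0.9, 0.8]]
val_labels = [0, 1, 0, 0]
print(f"val acc: {accuracy(val_logits, val_labels):.2%}")  # → val acc: 75.00%
```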

📊 Results

Metric	Value
Validation Accuracy	72%
Dataset	RWF-2000
Model	TSM + MobileNet
Segments (T)	8
Image Size	224×224

🔧 Requirements

torch
torchvision
numpy
opencv-python   # for frame extraction

Install with:

pip install torch torchvision numpy opencv-python

🗂️ Running on Kaggle

This project is designed to run on Kaggle with GPU acceleration:

Add the RWF-2000 dataset to your Kaggle notebook
Update CFG.DATA_ROOT in config.py if needed (default points to the Kaggle input path)
Run train.py — frames will be extracted to /kaggle/working/rwf2000_frames/
The best model checkpoint is saved to /kaggle/working/tsm_mobilenet_best.pth
