VidSafe is an AI-based video moderation system that analyzes both visual and audio content to detect harmful elements such as violence and offensive speech. It processes video frames to identify violent regions and analyzes audio to detect toxic language. Based on these detections, the system selectively moderates only the unsafe parts of the video by blurring harmful visuals and censoring inappropriate audio, while keeping the rest of the content unchanged. In addition, the system generates a structured moderation report using predefined policies, providing details such as detected violations, timestamps, and recommended actions for better transparency and decision-making.
- Performs multimodal analysis by processing both video and audio content.
- Detects violent content at the region level within video frames.
- Identifies and censors toxic or offensive speech from audio.
- Applies selective moderation by modifying only unsafe parts instead of removing the entire video.
- Generates policy-based moderation reports with detected violations and timestamps.
The VidSafe system follows a structured pipeline to analyze and moderate video content:
- Input Processing: The input video is received and prepared for analysis by separating it into visual frames and audio.
- Frame Extraction & Audio Separation: Video frames are extracted at regular intervals, and the audio stream is isolated for independent processing.
- Semantic Filtering: Relevant frames are selected using CLIP based on similarity to predefined prompts, reducing unnecessary computation.
- Violence Detection (Visual Analysis): Selected frames are processed using RT-DETR to detect regions containing violent content.
- Temporal Consistency Filtering: Detections are kept only if they persist across consecutive frames, which suppresses isolated single-frame detections.
- Audio Transcription & Analysis: The audio is converted to text using speech recognition, and the text is analyzed to detect toxic or offensive language.
- Multimodal Alignment: Visual and audio detections are aligned using timestamps so that each event maps to the correct point in the video.
- Selective Moderation: Detected harmful regions are blurred, and toxic audio segments are censored while preserving the rest of the video.
- Policy-Based Reasoning & Report Generation: Detected violations are evaluated using predefined policies, and a structured moderation report is generated with timestamps and recommended actions.
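
The temporal consistency step in the pipeline above can be illustrated with a minimal sketch. It assumes each sampled frame has already been reduced to a single "violence detected" flag; the function name and the `min_run` threshold are hypothetical and are not taken from the VidSafe code.

```python
from typing import List


def temporal_consistency_filter(flags: List[bool], min_run: int = 3) -> List[bool]:
    """Keep a detection only if it belongs to a run of at least
    `min_run` consecutive flagged frames (threshold is an assumption)."""
    kept = [False] * len(flags)
    run_start = None
    # A sentinel False is appended so the final run is closed inside the loop.
    for i, flagged in enumerate(flags + [False]):
        if flagged and run_start is None:
            run_start = i                      # a new run begins here
        elif not flagged and run_start is not None:
            if i - run_start >= min_run:       # run is long enough to keep
                for j in range(run_start, i):
                    kept[j] = True
            run_start = None
    return kept


# Per-frame flags for ten sampled frames (made-up values).
raw = [False, True, False, True, True, True, True, False, True, True]
stable = temporal_consistency_filter(raw, min_run=3)
print(stable)
```

With these inputs, only the four-frame run in the middle survives; the single-frame and two-frame detections are dropped.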
The system processes input video through visual and audio analysis modules. The extracted information is fused and evaluated using policy-based reasoning to generate a moderated video and a structured report.
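
One way to picture the fusion of visual and audio detections is as interval merging on a shared timeline. The sketch below is an assumption about how such alignment could work, not the project's actual implementation; the `Event` type, its field names, and the modality labels are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass
class Event:
    start: float    # seconds from the start of the video
    end: float
    modality: str   # "visual" or "audio" (labels assumed for this sketch)


def align_events(events: List[Event]) -> List[Tuple[float, float, Set[str]]]:
    """Merge events whose time ranges overlap into shared incidents."""
    incidents: List[Tuple[float, float, Set[str]]] = []
    for ev in sorted(events, key=lambda e: e.start):
        if incidents and ev.start <= incidents[-1][1]:
            # Overlaps the previous incident: extend it and record the modality.
            start, end, kinds = incidents[-1]
            incidents[-1] = (start, max(end, ev.end), kinds | {ev.modality})
        else:
            incidents.append((ev.start, ev.end, {ev.modality}))
    return incidents


events = [
    Event(12.0, 15.5, "visual"),
    Event(14.0, 16.0, "audio"),
    Event(40.0, 41.0, "audio"),
]
timeline = align_events(events)
for start, end, kinds in timeline:
    print(f"{start:.1f}s-{end:.1f}s: {sorted(kinds)}")
```

Here the first two events overlap and become one incident carrying both modalities, while the later audio-only event stays separate. A policy step could then treat multimodal incidents differently from single-modality ones.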
- RT-DETR: region-level violence detection
- CLIP: semantic frame filtering
- Faster-Whisper: speech-to-text transcription
- Detoxify: sentence-level toxicity detection
- RoBERTa: word-level toxicity analysis
- LLaMA 3 (via Groq API): moderation report generation
- PyTorch
- OpenCV
- Hugging Face Transformers
- Streamlit: interactive user interface
- FFmpeg: audio extraction and processing
- FAISS: vector database for policy retrieval
- Python 3.8+
- Google Colab / VS Code
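
As an illustration of how FFmpeg could be used to censor audio, the sketch below builds a command that silences a list of time spans with FFmpeg's `volume` filter and its `enable='between(t,...)'` timeline option. The README does not say whether VidSafe mutes or bleeps toxic speech, so muting is an assumption here, and the helper names and file names are hypothetical.

```python
from typing import List, Tuple


def build_mute_filter(spans: List[Tuple[float, float]]) -> str:
    """Build an FFmpeg audio filter chain that silences each (start, end) span."""
    if not spans:
        return "anull"  # pass-through filter when nothing needs censoring
    parts = [
        f"volume=enable='between(t,{start:.2f},{end:.2f})':volume=0"
        for start, end in spans
    ]
    return ",".join(parts)


def build_ffmpeg_command(src: str, dst: str, spans: List[Tuple[float, float]]) -> List[str]:
    # "-c:v copy" leaves the video stream untouched; only the audio is re-encoded.
    return ["ffmpeg", "-i", src, "-af", build_mute_filter(spans), "-c:v", "copy", dst]


# Spans would come from word-level timestamps of toxic words (made-up values).
toxic_spans = [(3.2, 3.9), (17.05, 17.6)]
cmd = build_ffmpeg_command("input.mp4", "censored.mp4", toxic_spans)
print(" ".join(cmd))
```

The command list can be passed to `subprocess.run(cmd, check=True)`. Passing a list rather than a shell string means the single quotes reach FFmpeg's filter parser unchanged.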
A custom dataset was created for this project due to the lack of publicly available datasets for region-level violence detection in animated content.
- Total frames: 8,401
- Violent: 3,819
- Non-violent: 4,582
- Annotated using CVAT
Dataset: Anime Violence Detection Dataset
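
A quick check of the class balance implied by the frame counts above, using only the numbers reported in this README:

```python
total = 8401
violent = 3819
non_violent = 4582

# The two classes should account for every annotated frame.
assert violent + non_violent == total

violent_share = violent / total
print(f"Violent: {violent_share:.1%}, non-violent: {1 - violent_share:.1%}")
```

The split is roughly 45% violent to 55% non-violent, so the two classes are close to balanced.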
The RT-DETR model was trained on this dataset to perform region-level violence detection. The model learns to identify and localize harmful visual content within video frames, enabling precise moderation.
The trained model is then integrated into the VidSafe pipeline for detecting and moderating unsafe video segments.
The performance of the trained RT-DETR model was evaluated on the custom annotated dataset.
| Metric | Value |
|---|---|
| Precision | 0.690 |
| Recall | 0.559 |
| mAP@50 | 0.622 |

| Metric | Value |
|---|---|
| Accuracy | 80.5% |
| Precision | 0.732 |
| Recall | 0.918 |
| F1 Score | 0.815 |
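
The F1 score is the harmonic mean of precision and recall, so the reported values can be cross-checked. The sketch below recomputes F1 for the second table and, as an illustration only, derives the F1 that the first table's precision and recall would imply (that table does not report one).

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


detection_f1 = f1_score(0.690, 0.559)   # first table; F1 is not reported there
second_f1 = f1_score(0.732, 0.918)      # second table reports F1 = 0.815

print(f"{detection_f1:.3f} {second_f1:.3f}")
```

The recomputed value for the second table rounds to the reported 0.815. The first table's figures imply an F1 of about 0.618.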
Original video frame containing violent visual content.
Detected violent regions are blurred, and harmful content is selectively moderated while preserving the rest of the video.
A structured report generated using policy-based reasoning, showing detected violations, timestamps, and recommended actions.
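
To make the idea of a structured report concrete, here is a minimal sketch of one possible shape for it. In VidSafe the report is produced by LLaMA 3 through the Groq API; the schema below (class names, field names, and action labels) is purely hypothetical and only shows how violations, timestamps, and recommended actions could be represented.

```python
import json
from dataclasses import dataclass, asdict, field
from typing import List


@dataclass
class Violation:
    category: str   # e.g. "violence" or "toxic_speech" (assumed labels)
    start_s: float  # start time in seconds
    end_s: float    # end time in seconds
    action: str     # e.g. "blur" or "mute" (assumed labels)


@dataclass
class ModerationReport:
    video: str
    violations: List[Violation] = field(default_factory=list)

    def to_json(self) -> str:
        # asdict() converts the nested dataclasses into plain dicts and lists.
        return json.dumps(asdict(self), indent=2)


report = ModerationReport(
    video="input.mp4",
    violations=[
        Violation("violence", 12.0, 16.0, "blur"),
        Violation("toxic_speech", 40.0, 41.0, "mute"),
    ],
)
print(report.to_json())
```

A fixed schema like this would make reports easy to store, compare, and display in the Streamlit interface.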
git clone https://github.com/Aparnamol-KS/VidSafe.git
cd VidSafe
pip install -r requirements.txt
streamlit run app.py
- Extend the system for real-time video moderation
- Introduce age-based or user-specific content filtering
- Expand detection to additional categories (e.g., explicit or sensitive content)
- Improve multimodal fusion for handling complex scenarios
- Optimize the system for deployment on large-scale platforms
Developed as part of a B.Tech AI & DS project.


