README.md (33 additions & 7 deletions)
@@ -41,13 +41,6 @@ We present **Video-RAC**, an adaptive chunking methodology for lecture videos wi
Alongside the method, we release **EduViQA**, a slide-centric, bilingual (Persian/English) lecture dataset containing 20 videos from 5 professors across STEM and education topics. Each lecture is paired with 50 synthetic QA items and categorized by duration (40% mid-length, ~20–40 minutes) to support controlled RAG benchmarking.
-**Key Highlights:**
-- ✨ **Adaptive chunking** using CLIP embeddings and SSIM for semantic segmentation
The dataset also captures slide transitions and keyframes extracted via CLIP+SSIM chunking, enabling multimodal retrieval experiments with aligned visuals and transcripts.
+
+**📥 Access Dataset:** [Hugging Face - EduViQA](https://huggingface.co/datasets/UIAIC/EduViQA)
+
+---
+
## 🧠 Research Background
This framework underpins the **EduViQA bilingual dataset**, designed for evaluating lecture-based RAG systems in both Persian and English. The dataset and code form a unified ecosystem for multimodal question generation and retrieval evaluation.
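
The SSIM half of the CLIP+SSIM chunking mentioned above can be illustrated with a minimal sketch. This is not the Video-RAC implementation: it uses a simplified single-window ("global") SSIM in plain NumPy rather than the usual sliding-window variant, it omits the CLIP embedding comparison entirely, and the names `global_ssim`, `find_boundaries`, and the `threshold=0.8` default are assumptions made for illustration.

```python
import numpy as np


def global_ssim(x, y, data_range=255.0):
    """Single-window SSIM over two whole grayscale frames (simplified)."""
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    # Standard SSIM stabilizing constants for the given dynamic range.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def find_boundaries(frames, threshold=0.8):
    """Return indices i where frame i differs sharply from frame i - 1.

    The threshold is a hypothetical value, not one taken from the project.
    """
    return [
        i
        for i in range(1, len(frames))
        if global_ssim(frames[i - 1], frames[i]) < threshold
    ]


# Toy input: three copies of one "slide" followed by two copies of another.
rng = np.random.default_rng(0)
slide_a = rng.integers(0, 256, size=(32, 32))
slide_b = rng.integers(0, 256, size=(32, 32))
frames = [slide_a, slide_a, slide_a, slide_b, slide_b]
boundaries = find_boundaries(frames)
```

In this toy input the only boundary reported is the point where the second slide begins. A real pipeline would likely sample frames from the video, use a windowed SSIM, and combine the score with CLIP embedding similarity before deciding on a chunk boundary.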