temporal-grounding

Here are 11 public repositories matching this topic...

This paper presents the VLMI framework for detecting activities in complex videos. It combines Swin Transformer video features with language prompts and an EIoU-based similarity measure to support accurate, query-driven activity detection and timestamping, and it handles visual noise and temporal uncertainty without requiring full manual labeling.

  • Updated Jan 25, 2026
  • Jupyter Notebook
