STaR: Scalable Task-Conditioned Retrieval for Long-Horizon Multimodal Robot Memory

[arXiv] [Project Page]

STaR System Overview

Task-Conditioned Retrieval via Information Bottleneck and Context-Aware Cross-Modal Reasoning

Overview

STaR is an agentic framework for scalable task-conditioned retrieval and contextual reasoning over long-horizon multimodal robot memory. It enables robots to answer open-ended spatial, temporal, and descriptive queries, and to produce precise, actionable outputs for navigation.

Key contributions:

  • Long-horizon multimodal memory (OmniMem) that integrates 3D primitives, temporally aligned video captions, and visual keyframes to support joint spatial, temporal, and semantic reasoning in open-world environments.
  • Scalable Task-Conditioned Retrieval (STaR) based on the Information Bottleneck principle, which distills compact, non-redundant, and task-relevant memory subsets from long-term experience without requiring predefined task lists.
  • Agentic RAG workflow that couples MLLM-based planning with structured memory retrieval and contextual reasoning, enabling accurate question answering and reliable downstream execution for navigation.
  • Extensive evaluation and real-robot deployment, demonstrating state-of-the-art performance on NaVQA and a challenging warehouse benchmark (WH-VQA in Isaac Sim), as well as robust long-horizon reasoning on a real Husky mobile robot.
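To make the retrieval idea concrete, here is a toy sketch (not the STaR implementation, whose code is not yet released): a greedy, MMR-style selection that rewards task relevance and penalizes redundancy with already-selected items, in the spirit of the Information Bottleneck objective of keeping a compact, non-redundant, task-relevant subset. All function names, embeddings, and the scoring rule are illustrative assumptions.

```python
# Toy sketch of task-conditioned memory selection (illustrative only).
import math

def cosine(a, b):
    # Cosine similarity between two vectors; 0.0 for zero vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def select_memory(task_emb, memory_embs, k=2, beta=1.5):
    """Greedily pick k memory items: reward relevance to the task
    embedding, penalize redundancy with items already selected."""
    selected = []
    candidates = list(range(len(memory_embs)))
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(task_emb, memory_embs[i])
            redundancy = max((cosine(memory_embs[i], memory_embs[j])
                              for j in selected), default=0.0)
            return relevance - beta * redundancy
        best = max(candidates, key=score)  # first maximal index on ties
        selected.append(best)
        candidates.remove(best)
    return selected

# Example: items 0 and 1 are exact duplicates; item 2 is less relevant
# but diverse, so it is picked over the duplicate.
task = [1.0, 0.0]
memory = [[0.95, 0.05], [0.95, 0.05], [0.2, 0.8]]
print(select_memory(task, memory, k=2))  # → [0, 2]
```

The redundancy penalty is what keeps the retrieved subset compact: without it (beta=0), the duplicate item 1 would be selected second despite adding no new information.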

Code

Code coming soon.
