Cosmos3 Inverse/Forward Dynamics: how to obtain observation.state (LeRobot-compatible) alongside predicted actions?

I am evaluating __Cosmos3 inverse dynamics__ (video → action) and would like to build __LeRobot-format__ episodes for downstream training/finetuning (`observation.images.*`, `observation.state`, `action`, timestamps, etc.).\
Cosmos3 inference currently returns an `action` tensor (e.g. `[T, raw_action_dim]`), but it is unclear how to obtain the corresponding __state observations__ (`observation.state`) required by LeRobot/robot-learning datasets.
 
---
 
### Context / Goal
 
- I run Cosmos3 inverse dynamics on an input video and obtain:
 
  - `action.data` with shape `[T, raw_action_dim]`
 
- I want to __append new episodes__ to an existing dataset in LeRobot/GR00T format (parquet + meta + mp4), which expects at minimum:
 
  - `observation.images.<cam>` (video frames)
  - `observation.state` (per-timestep state vector)
  - `action` (per-timestep action vector)
  - timestamps / frame_index / episode_index, etc.
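
To make the target layout concrete, here is a minimal sketch of the per-frame rows I am trying to assemble. The helper name `build_episode_rows`, the `fps` and `state_dim` parameters, and the use of NaN as a placeholder are my own assumptions for illustration. They are not part of the Cosmos3 or LeRobot APIs, and the video frames (`observation.images.<cam>`) are assumed to be stored separately as mp4.

```python
import numpy as np

def build_episode_rows(action, fps, episode_index, state_dim):
    """Assemble per-frame rows for one episode (hypothetical helper).

    `action` is the predicted [T, raw_action_dim] array. `observation.state`
    is filled with NaN because that is exactly the piece I do not know how
    to obtain from (video, predicted action) alone.
    """
    action = np.asarray(action, dtype=np.float32)
    num_frames = action.shape[0]
    rows = []
    for t in range(num_frames):
        rows.append({
            # Placeholder: the missing per-timestep state vector.
            "observation.state": np.full(state_dim, np.nan, dtype=np.float32),
            "action": action[t],
            "timestamp": t / fps,  # seconds since episode start
            "frame_index": t,
            "episode_index": episode_index,
        })
    return rows

# Example: 4 frames of a 7-dimensional action at 30 fps.
rows = build_episode_rows(np.zeros((4, 7)), fps=30, episode_index=0, state_dim=7)
```

Every field except `observation.state` can be filled from the video and the predicted actions; the state column is the gap.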
 
My blocker: given only (video, predicted action), how should I obtain `observation.state` to create a coherent episode?
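
For illustration, one fallback I could imagine is sketched below. It rests on strong assumptions that I have not verified for Cosmos3: that the predicted actions are per-step deltas expressed in the same space as the state, and that an initial state is known from somewhere else. The function name `states_from_delta_actions` is hypothetical. If the actions are absolute targets or live in a different space, this does not apply.

```python
import numpy as np

def states_from_delta_actions(initial_state, delta_actions):
    """Integrate per-step delta actions into a state proxy (hypothetical).

    Assumes state[t + 1] = state[t] + delta_actions[t], so state[t] is the
    state *before* action[t] is applied. Returns an array of shape [T, dim].
    """
    initial_state = np.asarray(initial_state, dtype=np.float32)
    delta_actions = np.asarray(delta_actions, dtype=np.float32)
    cumulative = np.cumsum(delta_actions, axis=0)
    # Shift by one step so the first state equals the initial state.
    offsets = np.concatenate(
        [np.zeros_like(delta_actions[:1]), cumulative[:-1]], axis=0
    )
    return initial_state + offsets

states = states_from_delta_actions([0.0, 0.0], [[1.0, 0.0], [0.0, 2.0], [3.0, 0.0]])
```

Is something like this reasonable, or is there a supported way to recover the state that matches the predicted actions?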
