RonaldSit/robovet


 _____   ____  ____   ______      ________ _______
|  __ \ / __ \|  _ \ / __ \ \    / /  ____|__   __|
| |__) | |  | | |_) | |  | \ \  / /| |__     | |
|  _  /| |  | |  _ <| |  | |\ \/ / |  __|    | |
| | \ \| |__| | |_) | |__| | \  /  | |____   | |
|_|  \_\\____/|____/ \____/   \/   |______|  |_|

vet your robot datasets — before you waste the training run


Install · Quick start · How to use · 37 checks

You spent an evening teleoperating a robot. Before you spend a GPU-day training on those episodes, spend 30 seconds making sure they aren't lying to you. This is what a lying dataset looks like:

$ robovet doctor ./my_dataset

  FAIL DATA-104   1 episode where metadata 'length' disagrees with the parquet
                  row count — the classic signature of a corrupted episode map.
  FAIL STATS-302  1 stat block disagrees with the actual data — every training
                  run normalizes with these numbers.
  WARN TIME-202   Loading this dataset requires tolerance_s ≥ 7.7e-03
                  (77× the default). Worst: episode 2, 7.29 ms off the grid.
  FAIL META-502   Σ episode lengths = 1086 but info.json total_frames = 1037 —
                  the metadata contradicts itself before a single file is read.

  5 fail · 4 warn · 23 pass
  UNSAFE TO TRAIN — fix the FAILs first.        (exit code 1 — CI-gate it)
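
One way to picture the TIME-202 number, as a minimal sketch rather than robovet's actual implementation: measure how far each timestamp sits from the nearest multiple of 1/fps and take the largest distance. The function name `min_tolerance_s`, this definition of "the grid", and the 1e-4 default (inferred from the "77×" figure above) are assumptions made for illustration.

```python
def min_tolerance_s(timestamps, fps):
    """Largest distance (seconds) between any timestamp and the nearest 1/fps grid point."""
    worst = 0.0
    for t in timestamps:
        nearest = round(t * fps) / fps  # closest point on the ideal frame grid
        worst = max(worst, abs(t - nearest))
    return worst


# Hypothetical episode at 30 fps where one frame arrived 5 ms late.
fps = 30
timestamps = [i / fps for i in range(5)]
timestamps[3] += 0.005

needed = min_tolerance_s(timestamps, fps)
ASSUMED_DEFAULT = 1e-4  # assumption: default tolerance_s implied by the report above
ratio = needed / ASSUMED_DEFAULT
print(f"needs tolerance_s >= {needed:.2e} ({ratio:.0f}x the assumed default)")
```

A loader that compares consecutive timestamp differences instead of absolute grid positions would give a slightly different number, so treat this as an intuition aid only.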

Quick start (60 seconds, no robot needed)

pip install "robovet[video]"

robovet demo ./demo      # builds a fake dataset with 10 real-world defects
robovet doctor ./demo    # catches all of them, tells you which episode, exits 1
robovet fix ./demo --apply   # repairs the metadata problems (.bak backups)
robovet doctor ./demo    # the metadata FAILs are gone

Want to see what healthy looks like? robovet demo ./d --clean builds the same dataset with zero defects. There's a v3 flavor too: robovet demo ./d3 --v3.

How you'll actually use it

① You just finished recording. Run robovet doctor ./my_task. Green means train. Red means it tells you exactly which episodes are broken and why — in plain English, with the issue number it reproduces. Most metadata problems are one robovet fix ./my_task --apply away (it backs everything up as .bak first).

② You found a dataset on the Hub and don't want to download 4 GB to find out it's broken.

pip install "robovet[hub]"
robovet doctor hf://lerobot/svla_so100_pickplace

This pulls only the meta/ folder (usually under 1 MB) and cross-checks the dataset's own ledger: does the episode↔frame index math add up, do the counters match, are the per-episode stats stale, do the video time windows fit. The nastiest corruption class (lerobot#2401) is visible from metadata alone. To be clear about what this can't see: values, timestamps and video decoding still need the files, so a remote pass says META CLEAN, never CLEAN. (--meta-only also works on local paths when you want a one-second pre-check.)
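
The kind of ledger cross-check described above can be sketched in a few lines. This is an illustration under stated assumptions, not robovet's code: `check_ledger` is a hypothetical helper, and the per-episode field names `from_index` and `to_index` are placeholders for whatever the dataset's metadata actually calls its frame ranges. Only `length`, `total_frames` and `total_episodes` are taken from the text and from info.json.

```python
def check_ledger(info, episodes):
    """Return human-readable contradictions found in the metadata alone."""
    problems = []

    # Counter check: the per-episode lengths must add up to the global total.
    total = sum(ep["length"] for ep in episodes)
    if total != info["total_frames"]:
        problems.append(
            f"sum of episode lengths = {total} but total_frames = {info['total_frames']}"
        )
    if len(episodes) != info["total_episodes"]:
        problems.append(
            f"{len(episodes)} episode records but total_episodes = {info['total_episodes']}"
        )

    # Index math: frame ranges must tile the dataset with no gaps or overlaps.
    expected_start = 0
    for i, ep in enumerate(episodes):
        if ep["from_index"] != expected_start:
            problems.append(
                f"episode {i} starts at {ep['from_index']}, expected {expected_start}"
            )
        if ep["to_index"] - ep["from_index"] != ep["length"]:
            problems.append(f"episode {i} index span disagrees with its length")
        expected_start = ep["to_index"]
    return problems


# Hypothetical metadata: two episodes, but the global counter is off by one.
episodes = [
    {"length": 3, "from_index": 0, "to_index": 3},
    {"length": 4, "from_index": 3, "to_index": 7},
]
for line in check_ledger({"total_frames": 8, "total_episodes": 2}, episodes):
    print("FAIL", line)
```

Nothing here reads a parquet or video file, which is why this class of check is cheap enough to run against a remote repository.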

③ You want bad data blocked before it reaches your team's training runs. robovet doctor exits 1 on any FAIL, so CI can gate dataset merges the same way Codecov gates coverage:

name: robovet
on: [push, pull_request]
jobs:
  vet:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with: { python-version: "3.12" }
      - run: pip install "robovet[video]"
      - run: robovet doctor ./datasets/my_task   # FAIL blocks the merge

④ You want to drop your worst episodes before training.

robovet score ./my_task --worst 10     # the 10 episodes to look at first
robovet score ./my_task --csv scores.csv

Every episode gets a 0–100 score from cheap, fast signals computed in one pass: jerky motion, long idle stretches, gripper chatter, weird durations, saturated actions, exact duplicates. It's a triage list, not a judge — look at the flagged episodes yourself before deleting anything. (The 2026 curation papers — rinse, Demo-SCORE, QoQ — all argue for exactly this kind of cheap smoothness-first pass before any expensive policy-based filtering.)
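
To make two of those signals concrete, here is a rough sketch of an idle-stretch measure and a jerk measure over an action trajectory. The function names, the `eps` threshold and the finite-difference jerk estimate are assumptions chosen for illustration; robovet's real signals and its 0–100 scoring formula are not reproduced here.

```python
def idle_fraction(actions, eps=1e-3):
    """Share of steps where no action dimension moves by more than eps."""
    if len(actions) < 2:
        return 0.0
    idle = sum(
        1
        for prev, cur in zip(actions, actions[1:])
        if max(abs(c - p) for p, c in zip(prev, cur)) < eps
    )
    return idle / (len(actions) - 1)


def mean_abs_jerk(actions, fps):
    """Mean absolute third finite difference of the actions, in units per second cubed."""
    if len(actions) < 4:
        return 0.0
    total, count = 0.0, 0
    for i in range(len(actions) - 3):
        for d in range(len(actions[0])):
            # Third difference: a[i+3] - 3*a[i+2] + 3*a[i+1] - a[i]
            third = (
                actions[i + 3][d]
                - 3 * actions[i + 2][d]
                + 3 * actions[i + 1][d]
                - actions[i][d]
            )
            total += abs(third) * fps**3
            count += 1
    return total / count


# Hypothetical one-dimensional episodes: a smooth ramp and a chattering one.
smooth = [[float(i)] for i in range(10)]
jittery = [[0.0], [1.0], [0.0], [1.0], [0.0]]
print(mean_abs_jerk(smooth, fps=1), mean_abs_jerk(jittery, fps=1))
```

Ranking episodes by measures like these gives a list of candidates to inspect by eye, which matches the triage role described above.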

And when something goes wrong mid-training, start here:

Saw this error? Run this

| You hit | Look at | You get |
| --- | --- | --- |
| ValueError: timestamps … tolerance_s on load | TIME-202 | the exact minimal tolerance_s, and which episode is worst |
| wrong frames / IndexError after a v2→v3 conversion | DATA-104/105 + META-501 | which episodes' ledgers lie, cross-checked three ways |
| TorchCodec/AV1 decode errors | VIDEO-403 | per-camera codec tiers and what to re-encode |
| loss=NaN out of nowhere | DATA-107 + STATS-302 | NaN/Inf locations and stale normalization stats |
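
For the loss=NaN case, "NaN/Inf locations" means knowing which frame and which dimension went bad, not just that something did. A minimal sketch of that idea follows; `find_non_finite` is a hypothetical helper operating on plain lists, not a robovet function.

```python
import math


def find_non_finite(rows):
    """Return (frame_index, dim_index, value) for every NaN or Inf in a sequence of vectors."""
    hits = []
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if not math.isfinite(value):
                hits.append((i, j, value))
    return hits


# Hypothetical action vectors with two corrupted entries.
rows = [[0.0, 1.0], [float("nan"), 2.0], [3.0, float("inf")]]
for frame, dim, value in find_non_finite(rows):
    print(f"frame {frame}, dim {dim}: {value}")
```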

Why this exists

Robot learning's bottleneck moved from models to data, and the data is quietly broken. An April 2026 audit of 10 popular open robot datasets found floating-point drift that breaks video decoding after ~45 episodes, a v2.1→v3.0 conversion bug that silently scrambles which frames belong to which episode (training "works" — on jumbled sequences), and datasets that only load with tolerance_s cranked to 100× the default. Hugging Face's own cleanup of community datasets found 111 of 240 failed validation — and that pipeline is internal; you can't run it on yours. Meanwhile everyone agrees a well-curated 500-demo fine-tune beats a sloppy one 10× the size. The missing piece is tooling, and that's what this is.

Every check maps to a documented, real-world failure — the lerobot issue numbers are right there in the table below.

What it checks

| Group | Catches | Maps to |
| --- | --- | --- |
| STRUCT-0xx | missing/invalid metadata, dangling episodes, orphan files | lerobot#761 (no validator for hand-rolled conversions) |
| DATA-1xx | episode↔frame mapping corruption, schema drift, NaN/Inf, dead dims | lerobot#2401 (silent v2.1→v3.0 corruption) |
| TIME-2xx | off-grid timestamps with the exact tolerance_s you'd need, non-monotonic time, cumulative FP drift | lerobot#933, lerobot#3177 |
| STATS-3xx | stored normalization stats that disagree with the data, broken quantile stats (q01/q99) | HF docs warning; phospho repair post; lerobot#2189 |
| META-5xx | the dataset's ledger contradicting itself (works without downloading the data) | lerobot#2401 class, caught from metadata alone |
| VIDEO-4xx | video/parquet frame-count desync, including per-episode windows inside shared v3 files; codec tiers (h264 ✓ / AV1 info, since it's lerobot's own default / mpeg4-hevc warn); fps mismatch | Correll-lab postmortem; phospho notes |
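
As an example of what a STATS-3xx style check involves, the sketch below recomputes min, max, mean and std from the data and reports which stored values disagree. The helper name `stats_disagree`, the tolerances, and the use of population standard deviation are assumptions for illustration, not robovet's actual thresholds.

```python
import math
import statistics


def stats_disagree(values, stored, rel_tol=1e-3, abs_tol=1e-6):
    """Return the names of stored stats that do not match stats recomputed from values."""
    actual = {
        "min": min(values),
        "max": max(values),
        "mean": statistics.fmean(values),
        "std": statistics.pstdev(values),  # assumption: population std (ddof=0)
    }
    return [
        name
        for name, value in actual.items()
        if name in stored
        and not math.isclose(stored[name], value, rel_tol=rel_tol, abs_tol=abs_tol)
    ]


# Hypothetical feature column whose stored mean was never updated after an edit.
values = [1.0, 2.0, 3.0, 4.0]
stale = {"min": 1.0, "max": 4.0, "mean": 0.0, "std": 1.118034}
print(stats_disagree(values, stale))
```

Keys the check does not know about, such as quantiles, are simply ignored by this sketch rather than flagged.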

What fix will never do to your data

robovet fix is dry-run by default. With --apply it only rewrites metadata — episode lengths, normalization stats, info.json counters. It backs up every file it touches as .bak, it never modifies parquet or video payloads, and it preserves everything it doesn't understand: your quantile keys, image-stat blocks, episode tags. A repair tool must never be the thing that deletes your data, and the test suite enforces every one of these promises. Frame surgery (trimming desynced tails, re-gridding timestamps) is planned under the same rules.
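
The promises above (dry-run by default, a .bak backup before any write, unknown keys preserved) describe a pattern that can be sketched with the standard library. `repair_total_frames` is a hypothetical function written for this illustration and covers a single counter only; it is not how robovet fix is implemented.

```python
import json
import shutil
import tempfile
from pathlib import Path


def repair_total_frames(info_path, episode_lengths, apply=False):
    """Recompute total_frames from episode lengths. Dry-run unless apply=True.

    Returns (old_value, new_value) so the caller can report the planned change.
    """
    info_path = Path(info_path)
    info = json.loads(info_path.read_text())
    old = info.get("total_frames")
    new = sum(episode_lengths)
    if apply and old != new:
        backup = info_path.with_name(info_path.name + ".bak")
        shutil.copy2(info_path, backup)  # back up before touching the file
        info["total_frames"] = new  # only this key changes; every other key survives
        info_path.write_text(json.dumps(info, indent=2))
    return old, new


# Hypothetical info.json with a wrong counter and a key the tool knows nothing about.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "info.json"
    path.write_text(json.dumps({"total_frames": 1037, "my_custom_tag": "keep me"}))

    dry = repair_total_frames(path, [500, 586])  # reports, writes nothing
    unchanged = json.loads(path.read_text())["total_frames"]

    repair_total_frames(path, [500, 586], apply=True)
    repaired = json.loads(path.read_text())
    backup_exists = (Path(tmp) / "info.json.bak").exists()

print(dry, unchanged, repaired, backup_exists)
```

Loading the whole JSON document and changing one key is what keeps unrecognized fields intact; a tool that rebuilt the file from its own schema would silently drop them.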

Honest limits

  • v2.0/v2.1 and v3.x are both fully supported for diagnosis (each has its own fixture and tests; v3 gets per-episode video alignment inside shared files plus per-episode stats checks). fix currently repairs v2.x metadata; v3 stats regeneration is planned.
  • robovet doesn't merge, split or delete episodes — lerobot does that natively now. This tool does what the official stack doesn't: deep validation, metadata repair, and quality triage.
  • Local-first. Your data never leaves your disk.

Use it from Python

from robovet import load_dataset, run_doctor, score_dataset

ds  = load_dataset("./my_dataset")
rep = run_doctor(ds)                     # rep.exit_code, rep.results, rep.counts
sc  = score_dataset(ds, scan=rep.scan)   # reuses the same single IO pass

Apache-2.0. Issues and broken-dataset war stories are very welcome — if your dataset breaks in a way robovet doesn't catch, that's exactly the bug report we want.

About

Vet your robot datasets — 37 checks, repair & scoring for LeRobot data. Know it's UNSAFE TO TRAIN before wasting the run. Vets hf:// repos from 82 KB.
