From 57d2fa422e535155f625b68fbf1a3a58f8713338 Mon Sep 17 00:00:00 2001 From: paularamo Date: Mon, 15 Jun 2026 14:04:48 -0400 Subject: [PATCH 1/4] Improve CONTRIBUTING.md with cookbook structure, quality requirements, and recipe templates Replaces the generic contributing guide with a structured document that reflects the actual cookbooks/cosmos3/ directory layout and establishes clear quality requirements for community-contributed cookbooks: - Cookbook structure section with routing table (reasoner, generator/audiovisual, generator/action, generator/transfer) - Five quality requirements: open-access data, results/expected output, canonical setup, one-click runnable, naming convention - Cookbook README template ready for contributors to copy - Contribution areas table with clear accept/reject criteria - Testing checklist for pre-submission validation --- CONTRIBUTING.md | 256 +++++++++++++++++++++++++++++++++++++++++------- 1 file changed, 219 insertions(+), 37 deletions(-) diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index fd6ad5e3..59669a89 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -1,81 +1,263 @@ # Contributing to NVIDIA Cosmos -Thank you for your interest in contributing to NVIDIA Cosmos. This document provides guidelines and instructions for contributing. +Thank you for your interest in contributing to NVIDIA Cosmos. This guide covers how to propose changes, add new cookbooks, and maintain the quality bar we hold for community-facing content. ## Code of Conduct This project adheres to the [NVIDIA Open Source Code of Conduct](https://github.com/NVIDIA/cosmos/blob/main/CODE_OF_CONDUCT.md). By participating, you are expected to uphold this code. Please report unacceptable behavior by filing an issue or contacting [cosmos-license@nvidia.com](mailto:cosmos-license@nvidia.com). 
+--- + ## How to Contribute ### Reporting Issues -If you encounter a bug or have a feature request, please open an issue on the [GitHub Issues](https://github.com/NVIDIA/cosmos/issues) page. When filing an issue, include: +Open an issue on [GitHub Issues](https://github.com/NVIDIA/cosmos/issues) with: -- A clear and descriptive title -- Steps to reproduce the problem (if applicable) -- Expected behavior vs. actual behavior -- Your environment details (OS, CUDA version, GPU model, Python version) +- A clear, descriptive title +- Steps to reproduce (if applicable) +- Expected vs. actual behavior +- Environment details: OS, CUDA version, GPU model, Python version, `uv` version - Relevant logs or error messages -### Submitting Changes +### Contribution Workflow -1. **Fork the repository** and create a new branch from `main`: +1. **Fork** the repository and create a branch from `main`: - ```shell - git checkout -b your-branch-name + ```bash + git checkout -b cookbook/descriptive-name # or docs/, fix/, benchmark/ ``` -2. **Make your changes.** Ensure your changes follow the project conventions and do not introduce regressions. +2. **Make your changes** following the guidelines below. -3. **Test your changes.** Verify that existing cookbooks and examples still work correctly with your modifications. +3. **Test your changes.** Run your notebook end-to-end on the target GPU. Verify existing cookbooks are unaffected. -4. **Commit your changes** with a clear, descriptive commit message: +4. **Commit** with a clear message: - ```shell - git commit -m "Brief description of the change" + ```bash + git commit -m "Add worker-safety Reasoner cookbook with vLLM backend" ``` -5. **Push to your fork** and open a Pull Request against the `main` branch of the upstream repository. +5. **Push and open a Pull Request** against `main`. 
### Pull Request Guidelines - Provide a clear description of what your PR does and why -- Reference any related issues (e.g., `Fixes #123`) -- Keep PRs focused: one logical change per PR -- Ensure your branch is up to date with `main` before submitting -- Be responsive to review feedback +- Reference related issues (e.g., `Fixes #123`) +- One logical change per PR +- Ensure your branch is up to date with `main` +- Respond to review feedback promptly + +--- + +## Cookbook Structure + +The `cookbooks/` directory is organized by **model generation → tower → capability**. Each cookbook is a self-contained directory with a README, one or more runnable notebooks, and supporting assets. + +``` +cookbooks/ +└── cosmos3/ + ├── README.md # Shared setup (all backends) + ├── cosmos3-model-architecture.png + │ + ├── reasoner/ # Reasoner Tower + │ ├── README.md # Reasoner overview + backend table + │ ├── reasoner_prompt_guide.md # Prompt engineering reference + │ ├── run_with_vllm.ipynb # Cookbook: vLLM backend + │ ├── run_with_nim.ipynb # Cookbook: NIM backend + │ ├── run_with_cosmos_framework.ipynb # Cookbook: Framework backend + │ └── assets/ # Images, videos, sample outputs + │ + └── generator/ + ├── audiovisual/ # Generator: T2I, T2V, I2V, audio + │ ├── README.md + │ ├── run_with_diffusers.ipynb + │ ├── run_with_vllm_omni.ipynb + │ ├── run_with_cosmos_framework.ipynb + │ └── assets/ + │ + ├── action/ # Generator: policy, FDM, IDM + │ ├── README.md + │ ├── run_fd_with_cosmos_framework.ipynb + │ ├── run_fd_with_vllm.ipynb + │ ├── run_id_with_cosmos_framework.ipynb + │ ├── run_id_with_vllm.ipynb + │ ├── run_policy_with_cosmos_framework.md + │ └── assets/ + │ + └── transfer/ # Generator: video-to-video transfer + ├── README.md + ├── run_video_transfer_with_cosmos_framework.ipynb + ├── preview_helpers.py + ├── specs/ + └── assets/ +``` + +### Where Does My Cookbook Go? + +| Your cookbook does... 
| Place it under | +|----------------------|---------------| +| Image/video understanding, VLM, reasoning, grounding | `cookbooks/cosmos3/reasoner/` | +| Text-to-image, text-to-video, image-to-video, audio | `cookbooks/cosmos3/generator/audiovisual/` | +| Robotics policy, forward/inverse dynamics | `cookbooks/cosmos3/generator/action/` | +| Video-to-video style transfer, edge-guided generation | `cookbooks/cosmos3/generator/transfer/` | + +If your cookbook spans multiple towers (e.g., Reasoner analysis → Generator synthesis), create a new directory under `cookbooks/cosmos3/` with a clear name (e.g., `cookbooks/cosmos3/end2end/`). + +--- + +## Cookbook Quality Requirements + +Every cookbook merged into this repo must meet these requirements. Reviewers will check each item. + +### 1. Open-Access Data Only + +- All datasets must be **publicly downloadable** without NVIDIA-internal credentials +- Acceptable sources: HuggingFace Hub (public or gated with free access), public URLs, synthetic data generated in the notebook +- If working with partners, request a **small public subset** for the cookbook example +- Include the dataset license in your README + +**Not acceptable:** Internal S3 buckets, VPN-only URLs, private NFS mounts, datasets requiring paid partner agreements + +### 2. Results / Expected Output + +Every cookbook must include a **Results** section showing what a successful run looks like: + +- **Inference cookbooks:** Sample generated images/videos, text outputs, or action trajectories saved to `assets/` +- **Post-training cookbooks:** Training loss curves, before/after comparison, evaluation metrics +- **Timing benchmarks:** Wall-clock time on the target GPU (e.g., "Cosmos3-Nano T2V: 45s on 1× A100") + +This lets developers validate their own runs against a known-good baseline. + +### 3. 
Canonical Setup (No Hidden Dependencies) + +- **Do not duplicate setup instructions.** Link to the shared [`cookbooks/cosmos3/README.md`](cookbooks/cosmos3/README.md) for backend installation (Cosmos Framework, Diffusers, vLLM, NIM) +- Your README should only document **cookbook-specific** dependencies beyond the shared setup +- All dependencies must be installable via `uv pip install` or `apt-get` — no manual builds +- Pin specific versions of critical packages when they affect reproducibility + +### 4. One-Click Runnable + +- Each notebook should run **top-to-bottom without manual intervention** +- Use environment variables for configurable paths (`HF_TOKEN`, `COSMOS3_MEDIA_ROOT`, etc.) +- Default to the smallest model size (Cosmos3-Nano) so the widest set of GPUs can run it +- If a cookbook requires a running server (vLLM, NIM), provide the exact launch command in the README and automate the health check in the notebook + +### 5. Naming Convention + +Follow the existing pattern: + +``` +run__with_.ipynb +``` + +Examples: +- `run_with_vllm.ipynb` — generic Reasoner inference via vLLM +- `run_fd_with_cosmos_framework.ipynb` — forward dynamics via Cosmos Framework +- `run_video_transfer_with_cosmos_framework.ipynb` — video transfer via Cosmos Framework + +For markdown-only guides (no notebook): `run__with_.md` + +--- + +## Cookbook README Template + +Each cookbook directory needs a `README.md`. Use this structure: + +```markdown +# [Cookbook Title] + +One-paragraph description of what this cookbook demonstrates and why it matters. 
+ +## What You'll Build + +- Bullet list of concrete outputs (e.g., "Generate a 480p video from a text prompt") + +## Prerequisites + +- Link to [shared setup](../README.md#backend-name) for backend installation +- Any additional cookbook-specific requirements + +## Backends + +| Backend | Notebook | GPU Requirement | +|---------|----------|----------------| +| vLLM | [`run_with_vllm.ipynb`](run_with_vllm.ipynb) | 1× A100 (80 GB) | +| NIM | [`run_with_nim.ipynb`](run_with_nim.ipynb) | 1× A100 (80 GB) | + +## Quick Start + +Minimal steps to go from clone to first result: + + 1. Set up the backend (link) + 2. Run the notebook + 3. Check your outputs in `assets/` + +## Results / Expected Output + +Sample outputs, metrics, and timing benchmarks from a successful run. + +## Dataset + +| Name | Source | License | Size | +|------|--------|---------|------| +| Dataset Name | [HuggingFace link](...) | Apache 2.0 | ~2 GB | +``` + +--- + +## Contribution Areas + +We welcome contributions in these areas: + +| Area | Examples | +|------|---------| +| **New cookbooks** | Domain-specific applications (robotics, AV, healthcare, manufacturing) | +| **New backends** | Additional serving/inference backends for existing cookbooks | +| **Documentation** | README improvements, prompt guides, architecture explanations | +| **Bug fixes** | Notebook fixes, broken links, version compatibility issues | +| **Benchmarks** | Inference timing across GPU configurations (A100, H100, L40S, RTX 4090) | +| **Post-training recipes** | SFT, LoRA, domain adaptation examples with open datasets | + +### What We Won't Merge + +- Cookbooks that depend on internal/proprietary datasets +- Notebooks that require manual mid-run intervention +- Changes that break existing cookbook functionality +- Generated binary files (model weights, large media) — use HuggingFace/external links instead + +--- ## Development Setup ### Prerequisites - Python 3.10 or later -- CUDA 12.8 or 13.x (see 
[Troubleshooting](README.md#troubleshooting) for version matching) +- CUDA 12.8 or 13.x (see [Troubleshooting](README.md#troubleshooting)) - An NVIDIA GPU with sufficient VRAM for your target workflow -- `uv` >= 0.11.3 (install from [astral.sh/uv](https://astral.sh/uv)) +- `uv` >= 0.11.3 ([astral.sh/uv](https://astral.sh/uv)) +- `git-lfs` installed (`apt-get install git-lfs`) ### Getting Started -1. Clone the repository: +```bash +git clone https://github.com/NVIDIA/cosmos.git +cd cosmos +``` - ```shell - git clone https://github.com/NVIDIA/cosmos.git - cd cosmos - ``` - -2. Set up your environment following the instructions in the [README](README.md). +Follow [cookbooks/cosmos3/README.md](cookbooks/cosmos3/README.md) to set up the backend(s) your cookbook uses. -3. Explore the [cookbooks](cookbooks/) for end-to-end examples of Generator and Reasoner workflows. +### Testing Your Cookbook -## Contribution Areas +Before submitting: -We welcome contributions in the following areas: +1. **Clean run:** Restart your kernel and run all cells top-to-bottom +2. **Minimal GPU:** Test on the smallest supported GPU configuration +3. **No secrets:** Verify no API keys, tokens, or internal paths are committed +4. **Output cells:** Clear large output cells but keep the Results section outputs +5. **File sizes:** Ensure no single file exceeds 10 MB (use git-lfs for larger assets or link externally) -- **Cookbooks and examples:** New notebooks demonstrating Cosmos 3 capabilities -- **Documentation:** Improvements to README, cookbook READMEs, or inline documentation -- **Bug fixes:** Fixes for issues in existing code or documentation -- **Benchmarks:** Additional inference benchmark results across different hardware configurations +--- ## License @@ -83,4 +265,4 @@ By contributing to this project, you agree that your contributions will be licen ## Questions? 
-If you have questions about contributing, feel free to open an issue or reach out at [cosmos-license@nvidia.com](mailto:cosmos-license@nvidia.com). +If you have questions about contributing, open an issue or reach out at [cosmos-license@nvidia.com](mailto:cosmos-license@nvidia.com). From cd95eccceb30733b1a5b0493fe947f2963c4e119 Mon Sep 17 00:00:00 2001 From: paularamo Date: Mon, 15 Jun 2026 14:18:49 -0400 Subject: [PATCH 2/4] Restructure cookbooks into basic_examples/ to prepare for community contributions Move all existing notebooks, assets, specs, and helpers into basic_examples/ subdirectories under each tower (reasoner, generator/audiovisual, generator/action, generator/transfer). This creates clean top-level directories ready to receive community-contributed cookbooks as sibling folders alongside basic_examples/. - reasoner/basic_examples/: 3 notebooks + prompt guide + 15 media assets - generator/audiovisual/basic_examples/: 3 notebooks + prompts/images - generator/action/basic_examples/: 5 notebooks + action/robot assets - generator/transfer/basic_examples/: 1 notebook + specs + control videos - Updated all 4 tower READMEs with basic_examples/ cookbook tables and paths - Updated CONTRIBUTING.md directory tree to show contribution placement --- CONTRIBUTING.md | 49 +++++++++++------- cookbooks/cosmos3/generator/action/README.md | 40 +++++++++----- .../assets/actions/av_traj_forward.json | 0 .../assets/actions/av_traj_left.json | 0 .../assets/actions/av_traj_right.json | 0 .../assets/actions/umi.json | 0 .../data/chunk-000/file-000.parquet | Bin .../meta/episodes/chunk-000/file-000.parquet | Bin .../droid_lerobot_example/meta/info.json | 0 .../droid_lerobot_example/meta/tasks.parquet | Bin .../chunk-000/file-000.mp4 | Bin .../chunk-000/file-000.mp4 | Bin .../chunk-000/file-000.mp4 | Bin .../assets/images/av_0.jpg | Bin .../assets/images/av_1.jpg | Bin .../assets/images/umi.png | Bin .../assets/videos/av_0.mp4 | Bin .../assets/videos/av_1.mp4 | Bin 
.../assets/videos/robolab_example_rollout.mp4 | Bin .../assets/videos/umi.mp4 | Bin .../run_fd_with_cosmos_framework.ipynb | 0 .../run_fd_with_vllm.ipynb | 0 .../run_id_with_cosmos_framework.ipynb | 0 .../run_id_with_vllm.ipynb | 0 .../run_policy_with_cosmos_framework.md | 0 .../cosmos3/generator/audiovisual/README.md | 36 ++++++++----- .../assets/images/image2video/car_driving.jpg | Bin .../images/image2video/coastal_road_audio.jpg | Bin .../images/image2video/humanoid_robot.jpg | Bin .../image2video/neg_prompt.json | 0 .../text2video/neg_prompt.json | 0 .../prompts/image2video/car_driving.json | 0 .../image2video/coastal_road_audio.json | 0 .../prompts/image2video/humanoid_robot.json | 0 .../prompts/text2image/robot_draping.json | 0 .../prompts/text2video/car_colliding.json | 0 .../prompts/text2video/robot_kitchen.json | 0 .../text2video/robot_pouring_water_audio.json | 0 .../run_with_cosmos_framework.ipynb | 0 .../run_with_diffusers.ipynb | 0 .../run_with_vllm_omni.ipynb | 0 .../cosmos3/generator/transfer/README.md | 46 ++++++++++------ .../assets/blur/control_blur.mp4 | Bin .../assets/blur/prompt.json | 0 .../assets/depth/control_depth.mp4 | Bin .../assets/depth/prompt.json | 0 .../assets/edge/control_edge.mp4 | Bin .../assets/edge/prompt.json | 0 .../assets/negative_prompt.json | 0 .../assets/seg/control_seg.mp4 | Bin .../assets/seg/prompt.json | 0 .../assets/wsm/control_wsm.mp4 | Bin .../assets/wsm/prompt.json | 0 .../{ => basic_examples}/preview_helpers.py | 0 ...video_transfer_with_cosmos_framework.ipynb | 0 .../{ => basic_examples}/specs/blur.json | 0 .../{ => basic_examples}/specs/depth.json | 0 .../{ => basic_examples}/specs/edge.json | 0 .../{ => basic_examples}/specs/seg.json | 0 .../{ => basic_examples}/specs/wsm.json | 0 cookbooks/cosmos3/reasoner/README.md | 33 ++++++++---- .../assets/action_cot_driving_scene.mp4 | Bin .../assets/action_cot_trajectory.png | Bin .../assets/assisted_task_next_action.mp4 | Bin .../assets/common_sense_reasoning.mp4 | 
Bin .../assets/describe_anything.png | Bin .../assets/drive_scene_next_action.mp4 | Bin .../assets/grounding_2d.png | Bin .../assets/physical_plausibility.mp4 | Bin .../{ => basic_examples}/assets/robot_153.jpg | Bin .../assets/robot_planning.png | Bin .../assets/robotics_next_action.mp4 | Bin .../assets/situation_understanding.mp4 | Bin .../assets/temporal_localization_1.mp4 | Bin .../assets/temporal_localization_2.mp4 | Bin .../assets/video_caption.mp4 | Bin .../reasoner_prompt_guide.md | 0 .../run_with_cosmos_framework.ipynb | 0 .../{ => basic_examples}/run_with_nim.ipynb | 0 .../{ => basic_examples}/run_with_vllm.ipynb | 0 80 files changed, 134 insertions(+), 70 deletions(-) rename cookbooks/cosmos3/generator/action/{ => basic_examples}/assets/actions/av_traj_forward.json (100%) rename cookbooks/cosmos3/generator/action/{ => basic_examples}/assets/actions/av_traj_left.json (100%) rename cookbooks/cosmos3/generator/action/{ => basic_examples}/assets/actions/av_traj_right.json (100%) rename cookbooks/cosmos3/generator/action/{ => basic_examples}/assets/actions/umi.json (100%) rename cookbooks/cosmos3/generator/action/{ => basic_examples}/assets/droid_lerobot_example/data/chunk-000/file-000.parquet (100%) rename cookbooks/cosmos3/generator/action/{ => basic_examples}/assets/droid_lerobot_example/meta/episodes/chunk-000/file-000.parquet (100%) rename cookbooks/cosmos3/generator/action/{ => basic_examples}/assets/droid_lerobot_example/meta/info.json (100%) rename cookbooks/cosmos3/generator/action/{ => basic_examples}/assets/droid_lerobot_example/meta/tasks.parquet (100%) rename cookbooks/cosmos3/generator/action/{ => basic_examples}/assets/droid_lerobot_example/videos/observation.image.exterior_image_1_left/chunk-000/file-000.mp4 (100%) rename cookbooks/cosmos3/generator/action/{ => basic_examples}/assets/droid_lerobot_example/videos/observation.image.exterior_image_2_left/chunk-000/file-000.mp4 (100%) rename cookbooks/cosmos3/generator/action/{ => 
basic_examples}/assets/droid_lerobot_example/videos/observation.image.wrist_image_left/chunk-000/file-000.mp4 (100%) rename cookbooks/cosmos3/generator/action/{ => basic_examples}/assets/images/av_0.jpg (100%) rename cookbooks/cosmos3/generator/action/{ => basic_examples}/assets/images/av_1.jpg (100%) rename cookbooks/cosmos3/generator/action/{ => basic_examples}/assets/images/umi.png (100%) rename cookbooks/cosmos3/generator/action/{ => basic_examples}/assets/videos/av_0.mp4 (100%) rename cookbooks/cosmos3/generator/action/{ => basic_examples}/assets/videos/av_1.mp4 (100%) rename cookbooks/cosmos3/generator/action/{ => basic_examples}/assets/videos/robolab_example_rollout.mp4 (100%) rename cookbooks/cosmos3/generator/action/{ => basic_examples}/assets/videos/umi.mp4 (100%) rename cookbooks/cosmos3/generator/action/{ => basic_examples}/run_fd_with_cosmos_framework.ipynb (100%) rename cookbooks/cosmos3/generator/action/{ => basic_examples}/run_fd_with_vllm.ipynb (100%) rename cookbooks/cosmos3/generator/action/{ => basic_examples}/run_id_with_cosmos_framework.ipynb (100%) rename cookbooks/cosmos3/generator/action/{ => basic_examples}/run_id_with_vllm.ipynb (100%) rename cookbooks/cosmos3/generator/action/{ => basic_examples}/run_policy_with_cosmos_framework.md (100%) rename cookbooks/cosmos3/generator/audiovisual/{ => basic_examples}/assets/images/image2video/car_driving.jpg (100%) rename cookbooks/cosmos3/generator/audiovisual/{ => basic_examples}/assets/images/image2video/coastal_road_audio.jpg (100%) rename cookbooks/cosmos3/generator/audiovisual/{ => basic_examples}/assets/images/image2video/humanoid_robot.jpg (100%) rename cookbooks/cosmos3/generator/audiovisual/{ => basic_examples}/assets/negative_prompts/image2video/neg_prompt.json (100%) rename cookbooks/cosmos3/generator/audiovisual/{ => basic_examples}/assets/negative_prompts/text2video/neg_prompt.json (100%) rename cookbooks/cosmos3/generator/audiovisual/{ => 
basic_examples}/assets/prompts/image2video/car_driving.json (100%) rename cookbooks/cosmos3/generator/audiovisual/{ => basic_examples}/assets/prompts/image2video/coastal_road_audio.json (100%) rename cookbooks/cosmos3/generator/audiovisual/{ => basic_examples}/assets/prompts/image2video/humanoid_robot.json (100%) rename cookbooks/cosmos3/generator/audiovisual/{ => basic_examples}/assets/prompts/text2image/robot_draping.json (100%) rename cookbooks/cosmos3/generator/audiovisual/{ => basic_examples}/assets/prompts/text2video/car_colliding.json (100%) rename cookbooks/cosmos3/generator/audiovisual/{ => basic_examples}/assets/prompts/text2video/robot_kitchen.json (100%) rename cookbooks/cosmos3/generator/audiovisual/{ => basic_examples}/assets/prompts/text2video/robot_pouring_water_audio.json (100%) rename cookbooks/cosmos3/generator/audiovisual/{ => basic_examples}/run_with_cosmos_framework.ipynb (100%) rename cookbooks/cosmos3/generator/audiovisual/{ => basic_examples}/run_with_diffusers.ipynb (100%) rename cookbooks/cosmos3/generator/audiovisual/{ => basic_examples}/run_with_vllm_omni.ipynb (100%) rename cookbooks/cosmos3/generator/transfer/{ => basic_examples}/assets/blur/control_blur.mp4 (100%) rename cookbooks/cosmos3/generator/transfer/{ => basic_examples}/assets/blur/prompt.json (100%) rename cookbooks/cosmos3/generator/transfer/{ => basic_examples}/assets/depth/control_depth.mp4 (100%) rename cookbooks/cosmos3/generator/transfer/{ => basic_examples}/assets/depth/prompt.json (100%) rename cookbooks/cosmos3/generator/transfer/{ => basic_examples}/assets/edge/control_edge.mp4 (100%) rename cookbooks/cosmos3/generator/transfer/{ => basic_examples}/assets/edge/prompt.json (100%) rename cookbooks/cosmos3/generator/transfer/{ => basic_examples}/assets/negative_prompt.json (100%) rename cookbooks/cosmos3/generator/transfer/{ => basic_examples}/assets/seg/control_seg.mp4 (100%) rename cookbooks/cosmos3/generator/transfer/{ => basic_examples}/assets/seg/prompt.json 
(100%) rename cookbooks/cosmos3/generator/transfer/{ => basic_examples}/assets/wsm/control_wsm.mp4 (100%) rename cookbooks/cosmos3/generator/transfer/{ => basic_examples}/assets/wsm/prompt.json (100%) rename cookbooks/cosmos3/generator/transfer/{ => basic_examples}/preview_helpers.py (100%) rename cookbooks/cosmos3/generator/transfer/{ => basic_examples}/run_video_transfer_with_cosmos_framework.ipynb (100%) rename cookbooks/cosmos3/generator/transfer/{ => basic_examples}/specs/blur.json (100%) rename cookbooks/cosmos3/generator/transfer/{ => basic_examples}/specs/depth.json (100%) rename cookbooks/cosmos3/generator/transfer/{ => basic_examples}/specs/edge.json (100%) rename cookbooks/cosmos3/generator/transfer/{ => basic_examples}/specs/seg.json (100%) rename cookbooks/cosmos3/generator/transfer/{ => basic_examples}/specs/wsm.json (100%) rename cookbooks/cosmos3/reasoner/{ => basic_examples}/assets/action_cot_driving_scene.mp4 (100%) rename cookbooks/cosmos3/reasoner/{ => basic_examples}/assets/action_cot_trajectory.png (100%) rename cookbooks/cosmos3/reasoner/{ => basic_examples}/assets/assisted_task_next_action.mp4 (100%) rename cookbooks/cosmos3/reasoner/{ => basic_examples}/assets/common_sense_reasoning.mp4 (100%) rename cookbooks/cosmos3/reasoner/{ => basic_examples}/assets/describe_anything.png (100%) rename cookbooks/cosmos3/reasoner/{ => basic_examples}/assets/drive_scene_next_action.mp4 (100%) rename cookbooks/cosmos3/reasoner/{ => basic_examples}/assets/grounding_2d.png (100%) rename cookbooks/cosmos3/reasoner/{ => basic_examples}/assets/physical_plausibility.mp4 (100%) rename cookbooks/cosmos3/reasoner/{ => basic_examples}/assets/robot_153.jpg (100%) rename cookbooks/cosmos3/reasoner/{ => basic_examples}/assets/robot_planning.png (100%) rename cookbooks/cosmos3/reasoner/{ => basic_examples}/assets/robotics_next_action.mp4 (100%) rename cookbooks/cosmos3/reasoner/{ => basic_examples}/assets/situation_understanding.mp4 (100%) rename 
cookbooks/cosmos3/reasoner/{ => basic_examples}/assets/temporal_localization_1.mp4 (100%) rename cookbooks/cosmos3/reasoner/{ => basic_examples}/assets/temporal_localization_2.mp4 (100%) rename cookbooks/cosmos3/reasoner/{ => basic_examples}/assets/video_caption.mp4 (100%) rename cookbooks/cosmos3/reasoner/{ => basic_examples}/reasoner_prompt_guide.md (100%) rename cookbooks/cosmos3/reasoner/{ => basic_examples}/run_with_cosmos_framework.ipynb (100%) rename cookbooks/cosmos3/reasoner/{ => basic_examples}/run_with_nim.ipynb (100%) rename cookbooks/cosmos3/reasoner/{ => basic_examples}/run_with_vllm.ipynb (100%) diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 59669a89..48c67c1b 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -62,35 +62,46 @@ cookbooks/ │ ├── reasoner/ # Reasoner Tower │ ├── README.md # Reasoner overview + backend table - │ ├── reasoner_prompt_guide.md # Prompt engineering reference - │ ├── run_with_vllm.ipynb # Cookbook: vLLM backend - │ ├── run_with_nim.ipynb # Cookbook: NIM backend - │ ├── run_with_cosmos_framework.ipynb # Cookbook: Framework backend - │ └── assets/ # Images, videos, sample outputs + │ ├── basic_examples/ # Shipped starter cookbooks + │ │ ├── reasoner_prompt_guide.md + │ │ ├── run_with_vllm.ipynb + │ │ ├── run_with_nim.ipynb + │ │ ├── run_with_cosmos_framework.ipynb + │ │ └── assets/ + │ └── / # ← Community contributions go here + │ ├── README.md + │ ├── run__with_.ipynb + │ └── assets/ │ └── generator/ ├── audiovisual/ # Generator: T2I, T2V, I2V, audio │ ├── README.md - │ ├── run_with_diffusers.ipynb - │ ├── run_with_vllm_omni.ipynb - │ ├── run_with_cosmos_framework.ipynb - │ └── assets/ + │ ├── basic_examples/ # Shipped starter cookbooks + │ │ ├── run_with_diffusers.ipynb + │ │ ├── run_with_vllm_omni.ipynb + │ │ ├── run_with_cosmos_framework.ipynb + │ │ └── assets/ + │ └── / # ← Community contributions go here │ ├── action/ # Generator: policy, FDM, IDM │ ├── README.md - │ ├── run_fd_with_cosmos_framework.ipynb - │ ├── 
run_fd_with_vllm.ipynb - │ ├── run_id_with_cosmos_framework.ipynb - │ ├── run_id_with_vllm.ipynb - │ ├── run_policy_with_cosmos_framework.md - │ └── assets/ + │ ├── basic_examples/ # Shipped starter cookbooks + │ │ ├── run_fd_with_cosmos_framework.ipynb + │ │ ├── run_fd_with_vllm.ipynb + │ │ ├── run_id_with_cosmos_framework.ipynb + │ │ ├── run_id_with_vllm.ipynb + │ │ ├── run_policy_with_cosmos_framework.md + │ │ └── assets/ + │ └── / # ← Community contributions go here │ └── transfer/ # Generator: video-to-video transfer ├── README.md - ├── run_video_transfer_with_cosmos_framework.ipynb - ├── preview_helpers.py - ├── specs/ - └── assets/ + ├── basic_examples/ # Shipped starter cookbooks + │ ├── run_video_transfer_with_cosmos_framework.ipynb + │ ├── preview_helpers.py + │ ├── specs/ + │ └── assets/ + └── / # ← Community contributions go here ``` ### Where Does My Cookbook Go? diff --git a/cookbooks/cosmos3/generator/action/README.md b/cookbooks/cosmos3/generator/action/README.md index 6158764f..e289f7e5 100644 --- a/cookbooks/cosmos3/generator/action/README.md +++ b/cookbooks/cosmos3/generator/action/README.md @@ -1,8 +1,24 @@ -# Cosmos3 Generator Action Examples +# Cosmos3 Generator Action Cookbooks -Cosmos3-Nano action-generation examples across two inference backends — native -PyTorch (Cosmos Framework) and vLLM-Omni. Both backends use the sample assets -under [`assets/`](./assets) and cover two tasks: +Cosmos3-Nano action-generation cookbooks across two inference backends — native +PyTorch (Cosmos Framework) and vLLM-Omni. + +## Basic Examples + +The [`basic_examples/`](./basic_examples/) directory contains the shipped starter +cookbooks and sample assets. Community-contributed cookbooks are added as sibling +directories alongside `basic_examples/` — see the +[Contributing Guide](../../../../CONTRIBUTING.md) for the recipe structure. 
+ +| Cookbook | Backend | Notebook | +|---------|---------|----------| +| Forward dynamics (AV, DROID, UMI) | Cosmos Framework | [`basic_examples/run_fd_with_cosmos_framework.ipynb`](./basic_examples/run_fd_with_cosmos_framework.ipynb) | +| Inverse dynamics (AV) | Cosmos Framework | [`basic_examples/run_id_with_cosmos_framework.ipynb`](./basic_examples/run_id_with_cosmos_framework.ipynb) | +| Policy (DROID) | Cosmos Framework | [`basic_examples/run_policy_with_cosmos_framework.md`](./basic_examples/run_policy_with_cosmos_framework.md) | +| Forward dynamics (AV, DROID, UMI) | vLLM-Omni | [`basic_examples/run_fd_with_vllm.ipynb`](./basic_examples/run_fd_with_vllm.ipynb) | +| Inverse dynamics (AV) | vLLM-Omni | [`basic_examples/run_id_with_vllm.ipynb`](./basic_examples/run_id_with_vllm.ipynb) | + +Both backends use the sample assets under [`basic_examples/assets/`](./basic_examples/assets/) and cover two tasks: - **Forward dynamics (`fd`)** — predict future observations from a start image plus an action trajectory (AV, DROID, and UMI robotics examples) using the Cosmos3-Nano. @@ -68,7 +84,7 @@ torchrun --nproc-per-node=1 \ The input spec pairs a start image with an action trajectory. The notebooks assemble ready-to-run specs for AV, DROID, and UMI examples from the checked-in -assets under [`assets/`](./assets). Outputs are written under the framework +assets under [`basic_examples/assets/`](./basic_examples/assets/). Outputs are written under the framework checkout. ### Cosmos Framework Walkthrough @@ -76,11 +92,11 @@ checkout. The Cosmos Framework build their input spec, run inference, and visualize the generated videos: -- [`run_fd_with_cosmos_framework.ipynb`](./run_fd_with_cosmos_framework.ipynb) — +- [`run_fd_with_cosmos_framework.ipynb`](./basic_examples/run_fd_with_cosmos_framework.ipynb) — forward dynamics for AV, DROID, and UMI robotics examples using Cosmos3-Nano. 
-- [`run_id_with_cosmos_framework.ipynb`](./run_id_with_cosmos_framework.ipynb) — +- [`run_id_with_cosmos_framework.ipynb`](./basic_examples/run_id_with_cosmos_framework.ipynb) — inverse dynamics, predicting ego-motion trajectories from input AV videos using Cosmos3-Nano. -- [`run_policy_with_cosmos_framework.md`](./run_policy_with_cosmos_framework.md) - policy, predicting future observations and action trajectories for DROID robot using Cosmos3-Nano-Policy-DROID. +- [`run_policy_with_cosmos_framework.md`](./basic_examples/run_policy_with_cosmos_framework.md) - policy, predicting future observations and action trajectories for DROID robot using Cosmos3-Nano-Policy-DROID. ## Run with vLLM-Omni @@ -100,8 +116,8 @@ curl http://localhost:8001/v1/models Forward-dynamics requests are multipart `POST`s to `/v1/videos` — a start image under `files={"input_reference": ...}` plus an `extra_params` payload carrying the action trajectory. The vLLM notebooks use these diffusion defaults for action -generation (see [`run_fd_with_vllm.ipynb`](./run_fd_with_vllm.ipynb) and -[`run_id_with_vllm.ipynb`](./run_id_with_vllm.ipynb)): +generation (see [`run_fd_with_vllm.ipynb`](./basic_examples/run_fd_with_vllm.ipynb) and +[`run_id_with_vllm.ipynb`](./basic_examples/run_id_with_vllm.ipynb)): | Field | Value | | --- | --- | @@ -117,9 +133,9 @@ including autoregressive chunked generation for the robotics examples. The vLLM-Omni notebooks send requests through the OpenAI-compatible video API and write outputs under `outputs/cosmos3_action_vllm/`: -- [`run_fd_with_vllm.ipynb`](./run_fd_with_vllm.ipynb) — forward dynamics for AV, +- [`run_fd_with_vllm.ipynb`](./basic_examples/run_fd_with_vllm.ipynb) — forward dynamics for AV, DROID, and UMI robotics examples. -- [`run_id_with_vllm.ipynb`](./run_id_with_vllm.ipynb) — inverse dynamics, +- [`run_id_with_vllm.ipynb`](./basic_examples/run_id_with_vllm.ipynb) — inverse dynamics, predicting ego-motion trajectories from input AV videos. 
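Since this change moves every notebook and asset under `basic_examples/`, the relative links in each tower README are the part most likely to go stale. The following is a minimal sketch of a link check, not part of the patch: the helper name `find_broken_links` and the regular expression are illustrative assumptions, and it only handles inline Markdown links of the form `[text](target)`.

```python
import re
from pathlib import Path

# Matches inline Markdown links such as
# [`run_fd_with_vllm.ipynb`](./basic_examples/run_fd_with_vllm.ipynb).
# Hypothetical pattern: reference-style links and HTML <a> tags are not covered.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def find_broken_links(readme: Path) -> list[str]:
    """Return relative link targets in `readme` that do not exist on disk."""
    broken = []
    for target in LINK_RE.findall(readme.read_text(encoding="utf-8")):
        # Skip absolute URLs, mail links, and same-page anchors.
        if target.startswith(("http://", "https://", "mailto:", "#")):
            continue
        path_part = target.split("#", 1)[0]  # drop any #anchor suffix
        # Relative targets are resolved against the README's own directory.
        if not (readme.parent / path_part).exists():
            broken.append(target)
    return broken
```

Run from the repository root, a call such as `find_broken_links(Path("cookbooks/cosmos3/generator/action/README.md"))` should return an empty list once all links point into `basic_examples/`; any entry it returns is a path that still refers to the pre-move layout.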
diff --git a/cookbooks/cosmos3/generator/action/assets/actions/av_traj_forward.json b/cookbooks/cosmos3/generator/action/basic_examples/assets/actions/av_traj_forward.json similarity index 100% rename from cookbooks/cosmos3/generator/action/assets/actions/av_traj_forward.json rename to cookbooks/cosmos3/generator/action/basic_examples/assets/actions/av_traj_forward.json diff --git a/cookbooks/cosmos3/generator/action/assets/actions/av_traj_left.json b/cookbooks/cosmos3/generator/action/basic_examples/assets/actions/av_traj_left.json similarity index 100% rename from cookbooks/cosmos3/generator/action/assets/actions/av_traj_left.json rename to cookbooks/cosmos3/generator/action/basic_examples/assets/actions/av_traj_left.json diff --git a/cookbooks/cosmos3/generator/action/assets/actions/av_traj_right.json b/cookbooks/cosmos3/generator/action/basic_examples/assets/actions/av_traj_right.json similarity index 100% rename from cookbooks/cosmos3/generator/action/assets/actions/av_traj_right.json rename to cookbooks/cosmos3/generator/action/basic_examples/assets/actions/av_traj_right.json diff --git a/cookbooks/cosmos3/generator/action/assets/actions/umi.json b/cookbooks/cosmos3/generator/action/basic_examples/assets/actions/umi.json similarity index 100% rename from cookbooks/cosmos3/generator/action/assets/actions/umi.json rename to cookbooks/cosmos3/generator/action/basic_examples/assets/actions/umi.json diff --git a/cookbooks/cosmos3/generator/action/assets/droid_lerobot_example/data/chunk-000/file-000.parquet b/cookbooks/cosmos3/generator/action/basic_examples/assets/droid_lerobot_example/data/chunk-000/file-000.parquet similarity index 100% rename from cookbooks/cosmos3/generator/action/assets/droid_lerobot_example/data/chunk-000/file-000.parquet rename to cookbooks/cosmos3/generator/action/basic_examples/assets/droid_lerobot_example/data/chunk-000/file-000.parquet diff --git 
a/cookbooks/cosmos3/generator/action/assets/droid_lerobot_example/meta/episodes/chunk-000/file-000.parquet b/cookbooks/cosmos3/generator/action/basic_examples/assets/droid_lerobot_example/meta/episodes/chunk-000/file-000.parquet similarity index 100% rename from cookbooks/cosmos3/generator/action/assets/droid_lerobot_example/meta/episodes/chunk-000/file-000.parquet rename to cookbooks/cosmos3/generator/action/basic_examples/assets/droid_lerobot_example/meta/episodes/chunk-000/file-000.parquet diff --git a/cookbooks/cosmos3/generator/action/assets/droid_lerobot_example/meta/info.json b/cookbooks/cosmos3/generator/action/basic_examples/assets/droid_lerobot_example/meta/info.json similarity index 100% rename from cookbooks/cosmos3/generator/action/assets/droid_lerobot_example/meta/info.json rename to cookbooks/cosmos3/generator/action/basic_examples/assets/droid_lerobot_example/meta/info.json diff --git a/cookbooks/cosmos3/generator/action/assets/droid_lerobot_example/meta/tasks.parquet b/cookbooks/cosmos3/generator/action/basic_examples/assets/droid_lerobot_example/meta/tasks.parquet similarity index 100% rename from cookbooks/cosmos3/generator/action/assets/droid_lerobot_example/meta/tasks.parquet rename to cookbooks/cosmos3/generator/action/basic_examples/assets/droid_lerobot_example/meta/tasks.parquet diff --git a/cookbooks/cosmos3/generator/action/assets/droid_lerobot_example/videos/observation.image.exterior_image_1_left/chunk-000/file-000.mp4 b/cookbooks/cosmos3/generator/action/basic_examples/assets/droid_lerobot_example/videos/observation.image.exterior_image_1_left/chunk-000/file-000.mp4 similarity index 100% rename from cookbooks/cosmos3/generator/action/assets/droid_lerobot_example/videos/observation.image.exterior_image_1_left/chunk-000/file-000.mp4 rename to cookbooks/cosmos3/generator/action/basic_examples/assets/droid_lerobot_example/videos/observation.image.exterior_image_1_left/chunk-000/file-000.mp4 diff --git 
a/cookbooks/cosmos3/generator/action/assets/droid_lerobot_example/videos/observation.image.exterior_image_2_left/chunk-000/file-000.mp4 b/cookbooks/cosmos3/generator/action/basic_examples/assets/droid_lerobot_example/videos/observation.image.exterior_image_2_left/chunk-000/file-000.mp4 similarity index 100% rename from cookbooks/cosmos3/generator/action/assets/droid_lerobot_example/videos/observation.image.exterior_image_2_left/chunk-000/file-000.mp4 rename to cookbooks/cosmos3/generator/action/basic_examples/assets/droid_lerobot_example/videos/observation.image.exterior_image_2_left/chunk-000/file-000.mp4 diff --git a/cookbooks/cosmos3/generator/action/assets/droid_lerobot_example/videos/observation.image.wrist_image_left/chunk-000/file-000.mp4 b/cookbooks/cosmos3/generator/action/basic_examples/assets/droid_lerobot_example/videos/observation.image.wrist_image_left/chunk-000/file-000.mp4 similarity index 100% rename from cookbooks/cosmos3/generator/action/assets/droid_lerobot_example/videos/observation.image.wrist_image_left/chunk-000/file-000.mp4 rename to cookbooks/cosmos3/generator/action/basic_examples/assets/droid_lerobot_example/videos/observation.image.wrist_image_left/chunk-000/file-000.mp4 diff --git a/cookbooks/cosmos3/generator/action/assets/images/av_0.jpg b/cookbooks/cosmos3/generator/action/basic_examples/assets/images/av_0.jpg similarity index 100% rename from cookbooks/cosmos3/generator/action/assets/images/av_0.jpg rename to cookbooks/cosmos3/generator/action/basic_examples/assets/images/av_0.jpg diff --git a/cookbooks/cosmos3/generator/action/assets/images/av_1.jpg b/cookbooks/cosmos3/generator/action/basic_examples/assets/images/av_1.jpg similarity index 100% rename from cookbooks/cosmos3/generator/action/assets/images/av_1.jpg rename to cookbooks/cosmos3/generator/action/basic_examples/assets/images/av_1.jpg diff --git a/cookbooks/cosmos3/generator/action/assets/images/umi.png 
b/cookbooks/cosmos3/generator/action/basic_examples/assets/images/umi.png similarity index 100% rename from cookbooks/cosmos3/generator/action/assets/images/umi.png rename to cookbooks/cosmos3/generator/action/basic_examples/assets/images/umi.png diff --git a/cookbooks/cosmos3/generator/action/assets/videos/av_0.mp4 b/cookbooks/cosmos3/generator/action/basic_examples/assets/videos/av_0.mp4 similarity index 100% rename from cookbooks/cosmos3/generator/action/assets/videos/av_0.mp4 rename to cookbooks/cosmos3/generator/action/basic_examples/assets/videos/av_0.mp4 diff --git a/cookbooks/cosmos3/generator/action/assets/videos/av_1.mp4 b/cookbooks/cosmos3/generator/action/basic_examples/assets/videos/av_1.mp4 similarity index 100% rename from cookbooks/cosmos3/generator/action/assets/videos/av_1.mp4 rename to cookbooks/cosmos3/generator/action/basic_examples/assets/videos/av_1.mp4 diff --git a/cookbooks/cosmos3/generator/action/assets/videos/robolab_example_rollout.mp4 b/cookbooks/cosmos3/generator/action/basic_examples/assets/videos/robolab_example_rollout.mp4 similarity index 100% rename from cookbooks/cosmos3/generator/action/assets/videos/robolab_example_rollout.mp4 rename to cookbooks/cosmos3/generator/action/basic_examples/assets/videos/robolab_example_rollout.mp4 diff --git a/cookbooks/cosmos3/generator/action/assets/videos/umi.mp4 b/cookbooks/cosmos3/generator/action/basic_examples/assets/videos/umi.mp4 similarity index 100% rename from cookbooks/cosmos3/generator/action/assets/videos/umi.mp4 rename to cookbooks/cosmos3/generator/action/basic_examples/assets/videos/umi.mp4 diff --git a/cookbooks/cosmos3/generator/action/run_fd_with_cosmos_framework.ipynb b/cookbooks/cosmos3/generator/action/basic_examples/run_fd_with_cosmos_framework.ipynb similarity index 100% rename from cookbooks/cosmos3/generator/action/run_fd_with_cosmos_framework.ipynb rename to cookbooks/cosmos3/generator/action/basic_examples/run_fd_with_cosmos_framework.ipynb diff --git 
a/cookbooks/cosmos3/generator/action/run_fd_with_vllm.ipynb b/cookbooks/cosmos3/generator/action/basic_examples/run_fd_with_vllm.ipynb similarity index 100% rename from cookbooks/cosmos3/generator/action/run_fd_with_vllm.ipynb rename to cookbooks/cosmos3/generator/action/basic_examples/run_fd_with_vllm.ipynb diff --git a/cookbooks/cosmos3/generator/action/run_id_with_cosmos_framework.ipynb b/cookbooks/cosmos3/generator/action/basic_examples/run_id_with_cosmos_framework.ipynb similarity index 100% rename from cookbooks/cosmos3/generator/action/run_id_with_cosmos_framework.ipynb rename to cookbooks/cosmos3/generator/action/basic_examples/run_id_with_cosmos_framework.ipynb diff --git a/cookbooks/cosmos3/generator/action/run_id_with_vllm.ipynb b/cookbooks/cosmos3/generator/action/basic_examples/run_id_with_vllm.ipynb similarity index 100% rename from cookbooks/cosmos3/generator/action/run_id_with_vllm.ipynb rename to cookbooks/cosmos3/generator/action/basic_examples/run_id_with_vllm.ipynb diff --git a/cookbooks/cosmos3/generator/action/run_policy_with_cosmos_framework.md b/cookbooks/cosmos3/generator/action/basic_examples/run_policy_with_cosmos_framework.md similarity index 100% rename from cookbooks/cosmos3/generator/action/run_policy_with_cosmos_framework.md rename to cookbooks/cosmos3/generator/action/basic_examples/run_policy_with_cosmos_framework.md diff --git a/cookbooks/cosmos3/generator/audiovisual/README.md b/cookbooks/cosmos3/generator/audiovisual/README.md index d80adad4..fbe388f4 100644 --- a/cookbooks/cosmos3/generator/audiovisual/README.md +++ b/cookbooks/cosmos3/generator/audiovisual/README.md @@ -1,14 +1,26 @@ -# Cosmos3 Generator Audiovisual Examples +# Cosmos3 Generator Audiovisual Cookbooks Generate images and video (with optional audio) from text or image prompts with -`Cosmos3-Nano` and `Cosmos3-Super`, across three inference backends. Sample -prompts live under [`assets/`](./assets). 
+`Cosmos3-Nano` and `Cosmos3-Super`, across three inference backends. Environment setup for every backend is centralized in the shared [Cosmos3 cookbooks environment setup](../../README.md) guide; each backend below links to the section you need. The quickstarts are minimal text-to-video examples to get one generation running per backend — run them from this folder. +## Basic Examples + +The [`basic_examples/`](./basic_examples/) directory contains the shipped starter +cookbooks and sample prompts. Community-contributed cookbooks are added as sibling +directories alongside `basic_examples/` — see the +[Contributing Guide](../../../../CONTRIBUTING.md) for the recipe structure. + +| Cookbook | Backend | Notebook | +|---------|---------|----------| +| T2I / T2V / I2V + audio | Cosmos Framework | [`basic_examples/run_with_cosmos_framework.ipynb`](./basic_examples/run_with_cosmos_framework.ipynb) | +| T2I / T2V / I2V + audio | Diffusers | [`basic_examples/run_with_diffusers.ipynb`](./basic_examples/run_with_diffusers.ipynb) | +| T2I / T2V / I2V + audio | vLLM-Omni | [`basic_examples/run_with_vllm_omni.ipynb`](./basic_examples/run_with_vllm_omni.ipynb) | + Generator requires the Guardrail. Request access to the gated [nvidia/Cosmos-1.0-Guardrail](https://huggingface.co/nvidia/Cosmos-1.0-Guardrail) HF repository before running these examples. To disable the guardrail, set @@ -31,12 +43,12 @@ import json from pathlib import Path prompt = json.dumps( - json.load(open("assets/prompts/text2video/robot_kitchen.json")), + json.load(open("basic_examples/assets/prompts/text2video/robot_kitchen.json")), ensure_ascii=True, separators=(",", ":"), ) negative = json.dumps( - json.load(open("assets/negative_prompts/text2video/neg_prompt.json")), + json.load(open("basic_examples/assets/negative_prompts/text2video/neg_prompt.json")), ensure_ascii=True, separators=(",", ":"), ) @@ -72,7 +84,7 @@ more GPUs via `--nproc-per-node`. 
### Notebook walkthrough -[`run_with_cosmos_framework.ipynb`](./run_with_cosmos_framework.ipynb) is the full +[`run_with_cosmos_framework.ipynb`](./basic_examples/run_with_cosmos_framework.ipynb) is the full tutorial for the native PyTorch backend: it covers every use case — text-to-image, text-to-video, image-to-video, with audio on or off — and includes the detailed, environment-aware setup and visualization for each generation. @@ -91,8 +103,8 @@ from diffusers import Cosmos3OmniPipeline from diffusers.schedulers.scheduling_unipc_multistep import UniPCMultistepScheduler from diffusers.utils import export_to_video -prompt = json.load(open("assets/prompts/text2video/robot_kitchen.json")) -negative = json.load(open("assets/negative_prompts/text2video/neg_prompt.json")) +prompt = json.load(open("basic_examples/assets/prompts/text2video/robot_kitchen.json")) +negative = json.load(open("basic_examples/assets/negative_prompts/text2video/neg_prompt.json")) pipe = Cosmos3OmniPipeline.from_pretrained( "nvidia/Cosmos3-Nano", torch_dtype=torch.bfloat16, device_map="cuda" @@ -122,7 +134,7 @@ To run **Cosmos3-Super** instead, load the larger checkpoint: ### Notebook walkthrough -[`run_with_diffusers.ipynb`](./run_with_diffusers.ipynb) is the full tutorial for +[`run_with_diffusers.ipynb`](./basic_examples/run_with_diffusers.ipynb) is the full tutorial for the Diffusers backend: it provisions a dedicated venv, then walks through text-to-image, text-to-video, and image-to-video generation (with and without audio) using `Cosmos3OmniPipeline`, including how to preview the generated media. 
@@ -145,8 +157,8 @@ from pathlib import Path import requests -prompt = json.load(open("assets/prompts/text2video/robot_kitchen.json")) -negative = json.load(open("assets/negative_prompts/text2video/neg_prompt.json")) +prompt = json.load(open("basic_examples/assets/prompts/text2video/robot_kitchen.json")) +negative = json.load(open("basic_examples/assets/negative_prompts/text2video/neg_prompt.json")) response = requests.post( "http://localhost:8000/v1/videos/sync", @@ -179,7 +191,7 @@ For image-to-video, post to the same endpoint with an image under ### Notebook walkthrough -[`run_with_vllm_omni.ipynb`](./run_with_vllm_omni.ipynb) is the full tutorial for +[`run_with_vllm_omni.ipynb`](./basic_examples/run_with_vllm_omni.ipynb) is the full tutorial for the vLLM-Omni backend: it walks through text-to-image, text-to-video, and image-to-video requests with audio on or off. Server launch options (Nano and Super, tensor parallelism, layerwise offload, and CFG-parallel variants) live in diff --git a/cookbooks/cosmos3/generator/audiovisual/assets/images/image2video/car_driving.jpg b/cookbooks/cosmos3/generator/audiovisual/basic_examples/assets/images/image2video/car_driving.jpg similarity index 100% rename from cookbooks/cosmos3/generator/audiovisual/assets/images/image2video/car_driving.jpg rename to cookbooks/cosmos3/generator/audiovisual/basic_examples/assets/images/image2video/car_driving.jpg diff --git a/cookbooks/cosmos3/generator/audiovisual/assets/images/image2video/coastal_road_audio.jpg b/cookbooks/cosmos3/generator/audiovisual/basic_examples/assets/images/image2video/coastal_road_audio.jpg similarity index 100% rename from cookbooks/cosmos3/generator/audiovisual/assets/images/image2video/coastal_road_audio.jpg rename to cookbooks/cosmos3/generator/audiovisual/basic_examples/assets/images/image2video/coastal_road_audio.jpg diff --git a/cookbooks/cosmos3/generator/audiovisual/assets/images/image2video/humanoid_robot.jpg 
b/cookbooks/cosmos3/generator/audiovisual/basic_examples/assets/images/image2video/humanoid_robot.jpg similarity index 100% rename from cookbooks/cosmos3/generator/audiovisual/assets/images/image2video/humanoid_robot.jpg rename to cookbooks/cosmos3/generator/audiovisual/basic_examples/assets/images/image2video/humanoid_robot.jpg diff --git a/cookbooks/cosmos3/generator/audiovisual/assets/negative_prompts/image2video/neg_prompt.json b/cookbooks/cosmos3/generator/audiovisual/basic_examples/assets/negative_prompts/image2video/neg_prompt.json similarity index 100% rename from cookbooks/cosmos3/generator/audiovisual/assets/negative_prompts/image2video/neg_prompt.json rename to cookbooks/cosmos3/generator/audiovisual/basic_examples/assets/negative_prompts/image2video/neg_prompt.json diff --git a/cookbooks/cosmos3/generator/audiovisual/assets/negative_prompts/text2video/neg_prompt.json b/cookbooks/cosmos3/generator/audiovisual/basic_examples/assets/negative_prompts/text2video/neg_prompt.json similarity index 100% rename from cookbooks/cosmos3/generator/audiovisual/assets/negative_prompts/text2video/neg_prompt.json rename to cookbooks/cosmos3/generator/audiovisual/basic_examples/assets/negative_prompts/text2video/neg_prompt.json diff --git a/cookbooks/cosmos3/generator/audiovisual/assets/prompts/image2video/car_driving.json b/cookbooks/cosmos3/generator/audiovisual/basic_examples/assets/prompts/image2video/car_driving.json similarity index 100% rename from cookbooks/cosmos3/generator/audiovisual/assets/prompts/image2video/car_driving.json rename to cookbooks/cosmos3/generator/audiovisual/basic_examples/assets/prompts/image2video/car_driving.json diff --git a/cookbooks/cosmos3/generator/audiovisual/assets/prompts/image2video/coastal_road_audio.json b/cookbooks/cosmos3/generator/audiovisual/basic_examples/assets/prompts/image2video/coastal_road_audio.json similarity index 100% rename from 
cookbooks/cosmos3/generator/audiovisual/assets/prompts/image2video/coastal_road_audio.json rename to cookbooks/cosmos3/generator/audiovisual/basic_examples/assets/prompts/image2video/coastal_road_audio.json diff --git a/cookbooks/cosmos3/generator/audiovisual/assets/prompts/image2video/humanoid_robot.json b/cookbooks/cosmos3/generator/audiovisual/basic_examples/assets/prompts/image2video/humanoid_robot.json similarity index 100% rename from cookbooks/cosmos3/generator/audiovisual/assets/prompts/image2video/humanoid_robot.json rename to cookbooks/cosmos3/generator/audiovisual/basic_examples/assets/prompts/image2video/humanoid_robot.json diff --git a/cookbooks/cosmos3/generator/audiovisual/assets/prompts/text2image/robot_draping.json b/cookbooks/cosmos3/generator/audiovisual/basic_examples/assets/prompts/text2image/robot_draping.json similarity index 100% rename from cookbooks/cosmos3/generator/audiovisual/assets/prompts/text2image/robot_draping.json rename to cookbooks/cosmos3/generator/audiovisual/basic_examples/assets/prompts/text2image/robot_draping.json diff --git a/cookbooks/cosmos3/generator/audiovisual/assets/prompts/text2video/car_colliding.json b/cookbooks/cosmos3/generator/audiovisual/basic_examples/assets/prompts/text2video/car_colliding.json similarity index 100% rename from cookbooks/cosmos3/generator/audiovisual/assets/prompts/text2video/car_colliding.json rename to cookbooks/cosmos3/generator/audiovisual/basic_examples/assets/prompts/text2video/car_colliding.json diff --git a/cookbooks/cosmos3/generator/audiovisual/assets/prompts/text2video/robot_kitchen.json b/cookbooks/cosmos3/generator/audiovisual/basic_examples/assets/prompts/text2video/robot_kitchen.json similarity index 100% rename from cookbooks/cosmos3/generator/audiovisual/assets/prompts/text2video/robot_kitchen.json rename to cookbooks/cosmos3/generator/audiovisual/basic_examples/assets/prompts/text2video/robot_kitchen.json diff --git 
a/cookbooks/cosmos3/generator/audiovisual/assets/prompts/text2video/robot_pouring_water_audio.json b/cookbooks/cosmos3/generator/audiovisual/basic_examples/assets/prompts/text2video/robot_pouring_water_audio.json similarity index 100% rename from cookbooks/cosmos3/generator/audiovisual/assets/prompts/text2video/robot_pouring_water_audio.json rename to cookbooks/cosmos3/generator/audiovisual/basic_examples/assets/prompts/text2video/robot_pouring_water_audio.json diff --git a/cookbooks/cosmos3/generator/audiovisual/run_with_cosmos_framework.ipynb b/cookbooks/cosmos3/generator/audiovisual/basic_examples/run_with_cosmos_framework.ipynb similarity index 100% rename from cookbooks/cosmos3/generator/audiovisual/run_with_cosmos_framework.ipynb rename to cookbooks/cosmos3/generator/audiovisual/basic_examples/run_with_cosmos_framework.ipynb diff --git a/cookbooks/cosmos3/generator/audiovisual/run_with_diffusers.ipynb b/cookbooks/cosmos3/generator/audiovisual/basic_examples/run_with_diffusers.ipynb similarity index 100% rename from cookbooks/cosmos3/generator/audiovisual/run_with_diffusers.ipynb rename to cookbooks/cosmos3/generator/audiovisual/basic_examples/run_with_diffusers.ipynb diff --git a/cookbooks/cosmos3/generator/audiovisual/run_with_vllm_omni.ipynb b/cookbooks/cosmos3/generator/audiovisual/basic_examples/run_with_vllm_omni.ipynb similarity index 100% rename from cookbooks/cosmos3/generator/audiovisual/run_with_vllm_omni.ipynb rename to cookbooks/cosmos3/generator/audiovisual/basic_examples/run_with_vllm_omni.ipynb diff --git a/cookbooks/cosmos3/generator/transfer/README.md b/cookbooks/cosmos3/generator/transfer/README.md index 0c477056..8c849185 100644 --- a/cookbooks/cosmos3/generator/transfer/README.md +++ b/cookbooks/cosmos3/generator/transfer/README.md @@ -1,7 +1,19 @@ -# Cosmos3 Generator Transfer Examples +# Cosmos3 Generator Transfer Cookbooks -Cosmos3-Nano video **transfer** examples on the native PyTorch (Cosmos Framework) path. 
-Sample assets under [`assets/`](./assets) cover spatial control signals paired with +Cosmos3-Nano video **transfer** cookbooks on the native PyTorch (Cosmos Framework) path. + +## Basic Examples + +The [`basic_examples/`](./basic_examples/) directory contains the shipped starter +cookbook and sample assets. Community-contributed cookbooks are added as sibling +directories alongside `basic_examples/` — see the +[Contributing Guide](../../../../CONTRIBUTING.md) for the recipe structure. + +| Cookbook | Backend | Notebook | +|---------|---------|----------| +| Video transfer (edge, blur, depth, seg, wsm) | Cosmos Framework | [`basic_examples/run_video_transfer_with_cosmos_framework.ipynb`](./basic_examples/run_video_transfer_with_cosmos_framework.ipynb) | + +Sample assets under [`basic_examples/assets/`](./basic_examples/assets/) cover spatial control signals paired with `prompt.json` files: - **Edge (Canny)** — edge map control plus caption. @@ -26,11 +38,11 @@ come from the control video; see the spec field reference for how `fps` and | Control | Asset folder | Inference input | Generation duration | | --- | --- | --- | --- | -| Edge (Canny) | `assets/edge/` | `control_edge.mp4` + `prompt.json` | 121 frames @ 30 FPS | -| Blur | `assets/blur/` | `control_blur.mp4` + `prompt.json` | 121 frames @ 30 FPS | -| Depth | `assets/depth/` | `control_depth.mp4` + `prompt.json` | 121 frames @ 30 FPS | -| Segmentation | `assets/seg/` | `control_seg.mp4` + `prompt.json` | 121 frames @ 30 FPS | -| World scenario (WSM) | `assets/wsm/` | `control_wsm.mp4` + `prompt.json` | 101 frames @ 10 FPS | +| Edge (Canny) | `basic_examples/assets/edge/` | `control_edge.mp4` + `prompt.json` | 121 frames @ 30 FPS | +| Blur | `basic_examples/assets/blur/` | `control_blur.mp4` + `prompt.json` | 121 frames @ 30 FPS | +| Depth | `basic_examples/assets/depth/` | `control_depth.mp4` + `prompt.json` | 121 frames @ 30 FPS | +| Segmentation | `basic_examples/assets/seg/` | `control_seg.mp4` + 
`prompt.json` | 121 frames @ 30 FPS | +| World scenario (WSM) | `basic_examples/assets/wsm/` | `control_wsm.mp4` + `prompt.json` | 101 frames @ 10 FPS | Transfer inference is selected automatically when any hint key is present in the spec. @@ -40,7 +52,7 @@ Transfer inference is selected automatically when any hint key is present in the Set up the environment: [Cosmos Framework setup](../../README.md#cosmos-framework). Activate the framework venv, then run inference (checked-in `specs/*.json` use paths -relative to `specs/`). Transfer on Nano looks like: +relative to `basic_examples/specs/`). Transfer on Nano looks like: ```bash cd cookbooks/cosmos3/generator/transfer @@ -49,7 +61,7 @@ cd cookbooks/cosmos3/generator/transfer torchrun --nproc-per-node=1 \ -m cosmos_framework.scripts.inference \ --parallelism-preset=latency \ - -i specs/edge.json \ + -i basic_examples/specs/edge.json \ -o ./output/ \ --checkpoint-path Cosmos3-Nano \ --seed 2026 @@ -58,7 +70,7 @@ torchrun --nproc-per-node=1 \ torchrun --nproc-per-node=1 \ -m cosmos_framework.scripts.inference \ --parallelism-preset=latency \ - -i specs/blur.json \ + -i basic_examples/specs/blur.json \ -o ./output/ \ --checkpoint-path Cosmos3-Nano \ --seed 2026 @@ -67,7 +79,7 @@ torchrun --nproc-per-node=1 \ torchrun --nproc-per-node=1 \ -m cosmos_framework.scripts.inference \ --parallelism-preset=latency \ - -i specs/depth.json \ + -i basic_examples/specs/depth.json \ -o ./output/ \ --checkpoint-path Cosmos3-Nano \ --seed 2026 @@ -76,7 +88,7 @@ torchrun --nproc-per-node=1 \ torchrun --nproc-per-node=1 \ -m cosmos_framework.scripts.inference \ --parallelism-preset=latency \ - -i specs/seg.json \ + -i basic_examples/specs/seg.json \ -o ./output/ \ --checkpoint-path Cosmos3-Nano \ --seed 2026 @@ -85,14 +97,14 @@ torchrun --nproc-per-node=1 \ torchrun --nproc-per-node=1 \ -m cosmos_framework.scripts.inference \ --parallelism-preset=latency \ - -i specs/wsm.json \ + -i basic_examples/specs/wsm.json \ -o ./output/ \ 
--checkpoint-path Cosmos3-Nano \ --seed 2026 ``` The input spec sets `prompt_path` and a hint block with `control_path` pointing at the -checked-in assets under [`assets/`](./assets) via paths relative to [`specs/`](./specs). +checked-in assets under [`basic_examples/assets/`](./basic_examples/assets/) via paths relative to [`basic_examples/specs/`](./basic_examples/specs/). Outputs are written under the directory passed to `-o`, with one subdirectory per sample name, for example `output/transfer_edge/vision.mp4`. Batch size must be 1 for transfer. @@ -137,10 +149,10 @@ Key fields: ### Cookbook entrypoints -- [`run_video_transfer_with_cosmos_framework.ipynb`](./run_video_transfer_with_cosmos_framework.ipynb) — +- [`run_video_transfer_with_cosmos_framework.ipynb`](./basic_examples/run_video_transfer_with_cosmos_framework.ipynb) — full tutorial on a **GPU host**: environment setup, `nvidia-smi` check, then five inference blocks (edge, blur, depth, seg, wsm) with previews. See [Cosmos3 environment setup](../../README.md). -- [`specs/`](./specs) — checked-in Framework input JSON per control (paths relative to `specs/`). +- [`basic_examples/specs/`](./basic_examples/specs/) — checked-in Framework input JSON per control (paths relative to `basic_examples/specs/`). 
### Troubleshooting diff --git a/cookbooks/cosmos3/generator/transfer/assets/blur/control_blur.mp4 b/cookbooks/cosmos3/generator/transfer/basic_examples/assets/blur/control_blur.mp4 similarity index 100% rename from cookbooks/cosmos3/generator/transfer/assets/blur/control_blur.mp4 rename to cookbooks/cosmos3/generator/transfer/basic_examples/assets/blur/control_blur.mp4 diff --git a/cookbooks/cosmos3/generator/transfer/assets/blur/prompt.json b/cookbooks/cosmos3/generator/transfer/basic_examples/assets/blur/prompt.json similarity index 100% rename from cookbooks/cosmos3/generator/transfer/assets/blur/prompt.json rename to cookbooks/cosmos3/generator/transfer/basic_examples/assets/blur/prompt.json diff --git a/cookbooks/cosmos3/generator/transfer/assets/depth/control_depth.mp4 b/cookbooks/cosmos3/generator/transfer/basic_examples/assets/depth/control_depth.mp4 similarity index 100% rename from cookbooks/cosmos3/generator/transfer/assets/depth/control_depth.mp4 rename to cookbooks/cosmos3/generator/transfer/basic_examples/assets/depth/control_depth.mp4 diff --git a/cookbooks/cosmos3/generator/transfer/assets/depth/prompt.json b/cookbooks/cosmos3/generator/transfer/basic_examples/assets/depth/prompt.json similarity index 100% rename from cookbooks/cosmos3/generator/transfer/assets/depth/prompt.json rename to cookbooks/cosmos3/generator/transfer/basic_examples/assets/depth/prompt.json diff --git a/cookbooks/cosmos3/generator/transfer/assets/edge/control_edge.mp4 b/cookbooks/cosmos3/generator/transfer/basic_examples/assets/edge/control_edge.mp4 similarity index 100% rename from cookbooks/cosmos3/generator/transfer/assets/edge/control_edge.mp4 rename to cookbooks/cosmos3/generator/transfer/basic_examples/assets/edge/control_edge.mp4 diff --git a/cookbooks/cosmos3/generator/transfer/assets/edge/prompt.json b/cookbooks/cosmos3/generator/transfer/basic_examples/assets/edge/prompt.json similarity index 100% rename from 
cookbooks/cosmos3/generator/transfer/assets/edge/prompt.json rename to cookbooks/cosmos3/generator/transfer/basic_examples/assets/edge/prompt.json diff --git a/cookbooks/cosmos3/generator/transfer/assets/negative_prompt.json b/cookbooks/cosmos3/generator/transfer/basic_examples/assets/negative_prompt.json similarity index 100% rename from cookbooks/cosmos3/generator/transfer/assets/negative_prompt.json rename to cookbooks/cosmos3/generator/transfer/basic_examples/assets/negative_prompt.json diff --git a/cookbooks/cosmos3/generator/transfer/assets/seg/control_seg.mp4 b/cookbooks/cosmos3/generator/transfer/basic_examples/assets/seg/control_seg.mp4 similarity index 100% rename from cookbooks/cosmos3/generator/transfer/assets/seg/control_seg.mp4 rename to cookbooks/cosmos3/generator/transfer/basic_examples/assets/seg/control_seg.mp4 diff --git a/cookbooks/cosmos3/generator/transfer/assets/seg/prompt.json b/cookbooks/cosmos3/generator/transfer/basic_examples/assets/seg/prompt.json similarity index 100% rename from cookbooks/cosmos3/generator/transfer/assets/seg/prompt.json rename to cookbooks/cosmos3/generator/transfer/basic_examples/assets/seg/prompt.json diff --git a/cookbooks/cosmos3/generator/transfer/assets/wsm/control_wsm.mp4 b/cookbooks/cosmos3/generator/transfer/basic_examples/assets/wsm/control_wsm.mp4 similarity index 100% rename from cookbooks/cosmos3/generator/transfer/assets/wsm/control_wsm.mp4 rename to cookbooks/cosmos3/generator/transfer/basic_examples/assets/wsm/control_wsm.mp4 diff --git a/cookbooks/cosmos3/generator/transfer/assets/wsm/prompt.json b/cookbooks/cosmos3/generator/transfer/basic_examples/assets/wsm/prompt.json similarity index 100% rename from cookbooks/cosmos3/generator/transfer/assets/wsm/prompt.json rename to cookbooks/cosmos3/generator/transfer/basic_examples/assets/wsm/prompt.json diff --git a/cookbooks/cosmos3/generator/transfer/preview_helpers.py b/cookbooks/cosmos3/generator/transfer/basic_examples/preview_helpers.py similarity 
index 100% rename from cookbooks/cosmos3/generator/transfer/preview_helpers.py rename to cookbooks/cosmos3/generator/transfer/basic_examples/preview_helpers.py diff --git a/cookbooks/cosmos3/generator/transfer/run_video_transfer_with_cosmos_framework.ipynb b/cookbooks/cosmos3/generator/transfer/basic_examples/run_video_transfer_with_cosmos_framework.ipynb similarity index 100% rename from cookbooks/cosmos3/generator/transfer/run_video_transfer_with_cosmos_framework.ipynb rename to cookbooks/cosmos3/generator/transfer/basic_examples/run_video_transfer_with_cosmos_framework.ipynb diff --git a/cookbooks/cosmos3/generator/transfer/specs/blur.json b/cookbooks/cosmos3/generator/transfer/basic_examples/specs/blur.json similarity index 100% rename from cookbooks/cosmos3/generator/transfer/specs/blur.json rename to cookbooks/cosmos3/generator/transfer/basic_examples/specs/blur.json diff --git a/cookbooks/cosmos3/generator/transfer/specs/depth.json b/cookbooks/cosmos3/generator/transfer/basic_examples/specs/depth.json similarity index 100% rename from cookbooks/cosmos3/generator/transfer/specs/depth.json rename to cookbooks/cosmos3/generator/transfer/basic_examples/specs/depth.json diff --git a/cookbooks/cosmos3/generator/transfer/specs/edge.json b/cookbooks/cosmos3/generator/transfer/basic_examples/specs/edge.json similarity index 100% rename from cookbooks/cosmos3/generator/transfer/specs/edge.json rename to cookbooks/cosmos3/generator/transfer/basic_examples/specs/edge.json diff --git a/cookbooks/cosmos3/generator/transfer/specs/seg.json b/cookbooks/cosmos3/generator/transfer/basic_examples/specs/seg.json similarity index 100% rename from cookbooks/cosmos3/generator/transfer/specs/seg.json rename to cookbooks/cosmos3/generator/transfer/basic_examples/specs/seg.json diff --git a/cookbooks/cosmos3/generator/transfer/specs/wsm.json b/cookbooks/cosmos3/generator/transfer/basic_examples/specs/wsm.json similarity index 100% rename from 
cookbooks/cosmos3/generator/transfer/specs/wsm.json rename to cookbooks/cosmos3/generator/transfer/basic_examples/specs/wsm.json diff --git a/cookbooks/cosmos3/reasoner/README.md b/cookbooks/cosmos3/reasoner/README.md index b3d3542d..3718e45a 100644 --- a/cookbooks/cosmos3/reasoner/README.md +++ b/cookbooks/cosmos3/reasoner/README.md @@ -1,15 +1,28 @@ -# Cosmos3 Reasoner Examples +# Cosmos3 Reasoner Cookbooks Run the Cosmos3 Reasoner (vision-language reasoning over images and video) across -multiple inference backends. Sample inputs live under [`assets/`](./assets). +multiple inference backends. Environment setup for every backend is centralized in the shared [Cosmos3 cookbooks environment setup](../README.md) guide; each backend below links to the section you need. +## Basic Examples + +The [`basic_examples/`](./basic_examples/) directory contains the shipped starter +cookbooks and sample inputs. Community-contributed cookbooks are added as sibling +directories alongside `basic_examples/` — see the +[Contributing Guide](../../../CONTRIBUTING.md) for the recipe structure. + +| Cookbook | Backend | Notebook | +|---------|---------|----------| +| Reasoner inference | Cosmos Framework | [`basic_examples/run_with_cosmos_framework.ipynb`](./basic_examples/run_with_cosmos_framework.ipynb) | +| Reasoner inference | vLLM | [`basic_examples/run_with_vllm.ipynb`](./basic_examples/run_with_vllm.ipynb) | +| Reasoner inference | NIM | [`basic_examples/run_with_nim.ipynb`](./basic_examples/run_with_nim.ipynb) | + ## Reasoner Prompt Guide -See the [Reasoner Prompt Guide](./reasoner_prompt_guide.md). +See the [Reasoner Prompt Guide](./basic_examples/reasoner_prompt_guide.md). 
## Run with Cosmos Framework @@ -29,7 +42,7 @@ cat > outputs/cookbooks/cosmos3/reasoner/inputs/robot_image.json <<'JSON' "model_mode": "reasoner", "name": "robot_image", "prompt": "Describe what is happening in this image in one sentence.", - "vision_path": "../../cookbooks/cosmos3/reasoner/assets/robot_153.jpg", + "vision_path": "../../cookbooks/cosmos3/reasoner/basic_examples/assets/robot_153.jpg", "enable_sound": false } JSON @@ -54,7 +67,7 @@ The generated text is written to ### Notebook walkthrough -[`run_with_cosmos_framework.ipynb`](./run_with_cosmos_framework.ipynb) is the full +[`run_with_cosmos_framework.ipynb`](./basic_examples/run_with_cosmos_framework.ipynb) is the full tutorial. It writes text and image smoke tests, then walks through image capability sections — detailed captioning, robot task planning, 2D grounding, describe-anything, and action-trajectory prompts — rendering the prompt, media @@ -73,7 +86,7 @@ Set up the environment and start the server: [Start the server](../README.md#start-the-server) (launch commands). The quickstart below uses **Cosmos3-Nano** on port 8000. The -[`run_with_vllm.ipynb`](./run_with_vllm.ipynb) notebook defaults to +[`run_with_vllm.ipynb`](./basic_examples/run_with_vllm.ipynb) notebook defaults to **Cosmos3-Super** on port **8001** — use that launch command from the env setup guide and point the client at `http://localhost:8001/v1`. 
@@ -83,7 +96,7 @@ Once the server is ready, query it with the OpenAI client: from pathlib import Path import openai -image_path = Path("assets/robot_153.jpg").resolve() +image_path = Path("basic_examples/assets/robot_153.jpg").resolve() image_url = image_path.as_uri() client = openai.OpenAI(api_key="EMPTY", base_url="http://localhost:8000/v1") @@ -108,7 +121,7 @@ print(response.choices[0].message.content) ### Notebook walkthrough -[`run_with_vllm.ipynb`](./run_with_vllm.ipynb) uses the **Cosmos3-Super** launch +[`run_with_vllm.ipynb`](./basic_examples/run_with_vllm.ipynb) uses the **Cosmos3-Super** launch from the [environment setup guide](../README.md#start-the-server) and walks through many more image and video examples: detailed captioning, VQA, temporal localization, embodied reasoning, common-sense reasoning, 2D @@ -138,7 +151,7 @@ import mimetypes from pathlib import Path import openai -image_path = Path("assets/robot_153.jpg").resolve() +image_path = Path("basic_examples/assets/robot_153.jpg").resolve() mime = mimetypes.guess_type(image_path.name)[0] or "application/octet-stream" image_url = f"data:{mime};base64,{base64.b64encode(image_path.read_bytes()).decode('ascii')}" @@ -170,7 +183,7 @@ for the full request reference. 
### Notebook walkthrough -[`run_with_nim.ipynb`](./run_with_nim.ipynb) is the NIM counterpart to the vLLM +[`run_with_nim.ipynb`](./basic_examples/run_with_nim.ipynb) is the NIM counterpart to the vLLM notebook: it launches the NIM container, waits for readiness, and then runs the same image and video examples — detailed captioning, VQA, temporal localization, embodied reasoning, common-sense reasoning, 2D grounding, describe-anything, diff --git a/cookbooks/cosmos3/reasoner/assets/action_cot_driving_scene.mp4 b/cookbooks/cosmos3/reasoner/basic_examples/assets/action_cot_driving_scene.mp4 similarity index 100% rename from cookbooks/cosmos3/reasoner/assets/action_cot_driving_scene.mp4 rename to cookbooks/cosmos3/reasoner/basic_examples/assets/action_cot_driving_scene.mp4 diff --git a/cookbooks/cosmos3/reasoner/assets/action_cot_trajectory.png b/cookbooks/cosmos3/reasoner/basic_examples/assets/action_cot_trajectory.png similarity index 100% rename from cookbooks/cosmos3/reasoner/assets/action_cot_trajectory.png rename to cookbooks/cosmos3/reasoner/basic_examples/assets/action_cot_trajectory.png diff --git a/cookbooks/cosmos3/reasoner/assets/assisted_task_next_action.mp4 b/cookbooks/cosmos3/reasoner/basic_examples/assets/assisted_task_next_action.mp4 similarity index 100% rename from cookbooks/cosmos3/reasoner/assets/assisted_task_next_action.mp4 rename to cookbooks/cosmos3/reasoner/basic_examples/assets/assisted_task_next_action.mp4 diff --git a/cookbooks/cosmos3/reasoner/assets/common_sense_reasoning.mp4 b/cookbooks/cosmos3/reasoner/basic_examples/assets/common_sense_reasoning.mp4 similarity index 100% rename from cookbooks/cosmos3/reasoner/assets/common_sense_reasoning.mp4 rename to cookbooks/cosmos3/reasoner/basic_examples/assets/common_sense_reasoning.mp4 diff --git a/cookbooks/cosmos3/reasoner/assets/describe_anything.png b/cookbooks/cosmos3/reasoner/basic_examples/assets/describe_anything.png similarity index 100% rename from 
cookbooks/cosmos3/reasoner/assets/describe_anything.png rename to cookbooks/cosmos3/reasoner/basic_examples/assets/describe_anything.png diff --git a/cookbooks/cosmos3/reasoner/assets/drive_scene_next_action.mp4 b/cookbooks/cosmos3/reasoner/basic_examples/assets/drive_scene_next_action.mp4 similarity index 100% rename from cookbooks/cosmos3/reasoner/assets/drive_scene_next_action.mp4 rename to cookbooks/cosmos3/reasoner/basic_examples/assets/drive_scene_next_action.mp4 diff --git a/cookbooks/cosmos3/reasoner/assets/grounding_2d.png b/cookbooks/cosmos3/reasoner/basic_examples/assets/grounding_2d.png similarity index 100% rename from cookbooks/cosmos3/reasoner/assets/grounding_2d.png rename to cookbooks/cosmos3/reasoner/basic_examples/assets/grounding_2d.png diff --git a/cookbooks/cosmos3/reasoner/assets/physical_plausibility.mp4 b/cookbooks/cosmos3/reasoner/basic_examples/assets/physical_plausibility.mp4 similarity index 100% rename from cookbooks/cosmos3/reasoner/assets/physical_plausibility.mp4 rename to cookbooks/cosmos3/reasoner/basic_examples/assets/physical_plausibility.mp4 diff --git a/cookbooks/cosmos3/reasoner/assets/robot_153.jpg b/cookbooks/cosmos3/reasoner/basic_examples/assets/robot_153.jpg similarity index 100% rename from cookbooks/cosmos3/reasoner/assets/robot_153.jpg rename to cookbooks/cosmos3/reasoner/basic_examples/assets/robot_153.jpg diff --git a/cookbooks/cosmos3/reasoner/assets/robot_planning.png b/cookbooks/cosmos3/reasoner/basic_examples/assets/robot_planning.png similarity index 100% rename from cookbooks/cosmos3/reasoner/assets/robot_planning.png rename to cookbooks/cosmos3/reasoner/basic_examples/assets/robot_planning.png diff --git a/cookbooks/cosmos3/reasoner/assets/robotics_next_action.mp4 b/cookbooks/cosmos3/reasoner/basic_examples/assets/robotics_next_action.mp4 similarity index 100% rename from cookbooks/cosmos3/reasoner/assets/robotics_next_action.mp4 rename to 
cookbooks/cosmos3/reasoner/basic_examples/assets/robotics_next_action.mp4 diff --git a/cookbooks/cosmos3/reasoner/assets/situation_understanding.mp4 b/cookbooks/cosmos3/reasoner/basic_examples/assets/situation_understanding.mp4 similarity index 100% rename from cookbooks/cosmos3/reasoner/assets/situation_understanding.mp4 rename to cookbooks/cosmos3/reasoner/basic_examples/assets/situation_understanding.mp4 diff --git a/cookbooks/cosmos3/reasoner/assets/temporal_localization_1.mp4 b/cookbooks/cosmos3/reasoner/basic_examples/assets/temporal_localization_1.mp4 similarity index 100% rename from cookbooks/cosmos3/reasoner/assets/temporal_localization_1.mp4 rename to cookbooks/cosmos3/reasoner/basic_examples/assets/temporal_localization_1.mp4 diff --git a/cookbooks/cosmos3/reasoner/assets/temporal_localization_2.mp4 b/cookbooks/cosmos3/reasoner/basic_examples/assets/temporal_localization_2.mp4 similarity index 100% rename from cookbooks/cosmos3/reasoner/assets/temporal_localization_2.mp4 rename to cookbooks/cosmos3/reasoner/basic_examples/assets/temporal_localization_2.mp4 diff --git a/cookbooks/cosmos3/reasoner/assets/video_caption.mp4 b/cookbooks/cosmos3/reasoner/basic_examples/assets/video_caption.mp4 similarity index 100% rename from cookbooks/cosmos3/reasoner/assets/video_caption.mp4 rename to cookbooks/cosmos3/reasoner/basic_examples/assets/video_caption.mp4 diff --git a/cookbooks/cosmos3/reasoner/reasoner_prompt_guide.md b/cookbooks/cosmos3/reasoner/basic_examples/reasoner_prompt_guide.md similarity index 100% rename from cookbooks/cosmos3/reasoner/reasoner_prompt_guide.md rename to cookbooks/cosmos3/reasoner/basic_examples/reasoner_prompt_guide.md diff --git a/cookbooks/cosmos3/reasoner/run_with_cosmos_framework.ipynb b/cookbooks/cosmos3/reasoner/basic_examples/run_with_cosmos_framework.ipynb similarity index 100% rename from cookbooks/cosmos3/reasoner/run_with_cosmos_framework.ipynb rename to 
cookbooks/cosmos3/reasoner/basic_examples/run_with_cosmos_framework.ipynb diff --git a/cookbooks/cosmos3/reasoner/run_with_nim.ipynb b/cookbooks/cosmos3/reasoner/basic_examples/run_with_nim.ipynb similarity index 100% rename from cookbooks/cosmos3/reasoner/run_with_nim.ipynb rename to cookbooks/cosmos3/reasoner/basic_examples/run_with_nim.ipynb diff --git a/cookbooks/cosmos3/reasoner/run_with_vllm.ipynb b/cookbooks/cosmos3/reasoner/basic_examples/run_with_vllm.ipynb similarity index 100% rename from cookbooks/cosmos3/reasoner/run_with_vllm.ipynb rename to cookbooks/cosmos3/reasoner/basic_examples/run_with_vllm.ipynb From 669ba5c53c1990f9a37cff2e246aa04a93d138dd Mon Sep 17 00:00:00 2001 From: paularamo Date: Mon, 15 Jun 2026 14:27:34 -0400 Subject: [PATCH 3/4] Update README.md and cookbooks setup README for basic_examples/ paths Fix all notebook links, nbviewer URLs, and asset paths in the main README.md Examples table (11 entries) and the NIM curl example to reference the new basic_examples/ subdirectories. Also update the cookbooks/cosmos3/README.md vLLM server section to point at the relocated run_with_vllm.ipynb. --- README.md | 24 ++++++++++++------------ cookbooks/cosmos3/README.md | 4 ++-- 2 files changed, 14 insertions(+), 14 deletions(-) diff --git a/README.md b/README.md index d54eaa07..a788f1a0 100644 --- a/README.md +++ b/README.md @@ -522,7 +522,7 @@ docker run -it --rm --name=$CONTAINER_NAME \ The OpenAI-compatible API is then available at `http://127.0.0.1:8000/v1`. 
Query it with `curl`: ```shell -IMAGE_DATA_URI="data:image/jpeg;base64,$(base64 -w 0 cookbooks/cosmos3/reasoner/assets/robot_153.jpg)" +IMAGE_DATA_URI="data:image/jpeg;base64,$(base64 -w 0 cookbooks/cosmos3/reasoner/basic_examples/assets/robot_153.jpg)" curl -X POST 'http://127.0.0.1:8000/v1/chat/completions' \ -H 'Accept: application/json' \ @@ -631,17 +631,17 @@ We are building examples that show Cosmos 3 capabilities end to end, including w | Example | Surface | Workflows demonstrated | Open | nbviewer | | --- | --- | --- | --- | --- | -| Generator (audiovisual) with Diffusers | Generator | Text-to-image, plus text-to-video and image-to-video each with or without synchronized sound, via `Cosmos3OmniPipeline`. | [Notebook](cookbooks/cosmos3/generator/audiovisual/run_with_diffusers.ipynb) | [![Render with nbviewer](https://raw.githubusercontent.com/jupyter/design/master/logos/Badges/nbviewer_badge.svg)](https://nbviewer.org/github/nvidia/cosmos/blob/main/cookbooks/cosmos3/generator/audiovisual/run_with_diffusers.ipynb) | -| Generator (audiovisual) with Cosmos Framework | Generator | Text-to-image, plus text-to-video and image-to-video each with sound on or off, through the `cosmos_framework.scripts.inference` entrypoint. | [Notebook](cookbooks/cosmos3/generator/audiovisual/run_with_cosmos_framework.ipynb) | [![Render with nbviewer](https://raw.githubusercontent.com/jupyter/design/master/logos/Badges/nbviewer_badge.svg)](https://nbviewer.org/github/nvidia/cosmos/blob/main/cookbooks/cosmos3/generator/audiovisual/run_with_cosmos_framework.ipynb) | -| Generator (audiovisual) with vLLM-Omni | Generator | Text-to-image, plus text-to-video and image-to-video each with sound on or off, against an OpenAI-compatible vLLM-Omni server. 
| [Notebook](cookbooks/cosmos3/generator/audiovisual/run_with_vllm_omni.ipynb) | [![Render with nbviewer](https://raw.githubusercontent.com/jupyter/design/master/logos/Badges/nbviewer_badge.svg)](https://nbviewer.org/github/nvidia/cosmos/blob/main/cookbooks/cosmos3/generator/audiovisual/run_with_vllm_omni.ipynb) | -| Forward dynamics with Cosmos Framework | Generator | Forward dynamics: action-conditioned future-observation prediction for AV, DROID, and UMI, through the `cosmos_framework.scripts.inference` entrypoint. | [Notebook](cookbooks/cosmos3/generator/action/run_fd_with_cosmos_framework.ipynb) | [![Render with nbviewer](https://raw.githubusercontent.com/jupyter/design/master/logos/Badges/nbviewer_badge.svg)](https://nbviewer.org/github/nvidia/cosmos/blob/main/cookbooks/cosmos3/generator/action/run_fd_with_cosmos_framework.ipynb) | -| Forward dynamics with vLLM-Omni | Generator | Forward dynamics: action-conditioned future-observation prediction for AV, DROID, and UMI, against an OpenAI-compatible vLLM-Omni server. | [Notebook](cookbooks/cosmos3/generator/action/run_fd_with_vllm.ipynb) | [![Render with nbviewer](https://raw.githubusercontent.com/jupyter/design/master/logos/Badges/nbviewer_badge.svg)](https://nbviewer.org/github/nvidia/cosmos/blob/main/cookbooks/cosmos3/generator/action/run_fd_with_vllm.ipynb) | -| Inverse dynamics with Cosmos Framework | Generator | Inverse dynamics: ego-motion trajectory prediction from input AV video, through the `cosmos_framework.scripts.inference` entrypoint. 
| [Notebook](cookbooks/cosmos3/generator/action/run_id_with_cosmos_framework.ipynb) | [![Render with nbviewer](https://raw.githubusercontent.com/jupyter/design/master/logos/Badges/nbviewer_badge.svg)](https://nbviewer.org/github/nvidia/cosmos/blob/main/cookbooks/cosmos3/generator/action/run_id_with_cosmos_framework.ipynb) | -| Inverse dynamics with vLLM-Omni | Generator | Inverse dynamics: ego-motion trajectory prediction from input AV video, against an OpenAI-compatible vLLM-Omni server. | [Notebook](cookbooks/cosmos3/generator/action/run_id_with_vllm.ipynb) | [![Render with nbviewer](https://raw.githubusercontent.com/jupyter/design/master/logos/Badges/nbviewer_badge.svg)](https://nbviewer.org/github/nvidia/cosmos/blob/main/cookbooks/cosmos3/generator/action/run_id_with_vllm.ipynb) | -| Transfer with Cosmos Framework | Generator | Video transfer: edge, blur, depth, segmentation, and world-scenario controls with captions, through the `cosmos_framework.scripts.inference` entrypoint. | [Notebook](cookbooks/cosmos3/generator/transfer/run_video_transfer_with_cosmos_framework.ipynb) | [![Render with nbviewer](https://raw.githubusercontent.com/jupyter/design/master/logos/Badges/nbviewer_badge.svg)](https://nbviewer.org/github/nvidia/cosmos/blob/main/cookbooks/cosmos3/generator/transfer/run_video_transfer_with_cosmos_framework.ipynb) | -| Reasoner with Cosmos Framework | Reasoner | Text and image reasoning: detailed captioning, robot task planning, 2D grounding, describe-anything, and action-trajectory prompts, through the `cosmos_framework.scripts.inference` entrypoint. 
| [Notebook](cookbooks/cosmos3/reasoner/run_with_cosmos_framework.ipynb) | [![Render with nbviewer](https://raw.githubusercontent.com/jupyter/design/master/logos/Badges/nbviewer_badge.svg)](https://nbviewer.org/github/nvidia/cosmos/blob/main/cookbooks/cosmos3/reasoner/run_with_cosmos_framework.ipynb) | -| Reasoner with vLLM | Reasoner | Image and video reasoning: captioning, temporal localization, embodied reasoning, common-sense reasoning, 2D grounding, describe-anything, action CoT, driving scenes, physical-plausibility, and situation understanding, against an OpenAI-compatible vLLM server (Cosmos3-Super on 4 GPUs by default; switch to Nano per the cookbook README). | [Notebook](cookbooks/cosmos3/reasoner/run_with_vllm.ipynb) | [![Render with nbviewer](https://raw.githubusercontent.com/jupyter/design/master/logos/Badges/nbviewer_badge.svg)](https://nbviewer.org/github/nvidia/cosmos/blob/main/cookbooks/cosmos3/reasoner/run_with_vllm.ipynb) | -| Reasoner with NIM | Reasoner | The same image and video reasoning examples as the vLLM notebook, run against the prebuilt, OpenAI-compatible [Cosmos 3 Reasoner NIM](https://catalog.ngc.nvidia.com/orgs/nim/teams/nvidia/containers/cosmos3-reasoner) container; local media is sent as base64 data URIs. | [Notebook](cookbooks/cosmos3/reasoner/run_with_nim.ipynb) | [![Render with nbviewer](https://raw.githubusercontent.com/jupyter/design/master/logos/Badges/nbviewer_badge.svg)](https://nbviewer.org/github/nvidia/cosmos/blob/main/cookbooks/cosmos3/reasoner/run_with_nim.ipynb) | +| Generator (audiovisual) with Diffusers | Generator | Text-to-image, plus text-to-video and image-to-video each with or without synchronized sound, via `Cosmos3OmniPipeline`. 
| [Notebook](cookbooks/cosmos3/generator/audiovisual/basic_examples/run_with_diffusers.ipynb) | [![Render with nbviewer](https://raw.githubusercontent.com/jupyter/design/master/logos/Badges/nbviewer_badge.svg)](https://nbviewer.org/github/nvidia/cosmos/blob/main/cookbooks/cosmos3/generator/audiovisual/basic_examples/run_with_diffusers.ipynb) | +| Generator (audiovisual) with Cosmos Framework | Generator | Text-to-image, plus text-to-video and image-to-video each with sound on or off, through the `cosmos_framework.scripts.inference` entrypoint. | [Notebook](cookbooks/cosmos3/generator/audiovisual/basic_examples/run_with_cosmos_framework.ipynb) | [![Render with nbviewer](https://raw.githubusercontent.com/jupyter/design/master/logos/Badges/nbviewer_badge.svg)](https://nbviewer.org/github/nvidia/cosmos/blob/main/cookbooks/cosmos3/generator/audiovisual/basic_examples/run_with_cosmos_framework.ipynb) | +| Generator (audiovisual) with vLLM-Omni | Generator | Text-to-image, plus text-to-video and image-to-video each with sound on or off, against an OpenAI-compatible vLLM-Omni server. | [Notebook](cookbooks/cosmos3/generator/audiovisual/basic_examples/run_with_vllm_omni.ipynb) | [![Render with nbviewer](https://raw.githubusercontent.com/jupyter/design/master/logos/Badges/nbviewer_badge.svg)](https://nbviewer.org/github/nvidia/cosmos/blob/main/cookbooks/cosmos3/generator/audiovisual/basic_examples/run_with_vllm_omni.ipynb) | +| Forward dynamics with Cosmos Framework | Generator | Forward dynamics: action-conditioned future-observation prediction for AV, DROID, and UMI, through the `cosmos_framework.scripts.inference` entrypoint. 
| [Notebook](cookbooks/cosmos3/generator/action/basic_examples/run_fd_with_cosmos_framework.ipynb) | [![Render with nbviewer](https://raw.githubusercontent.com/jupyter/design/master/logos/Badges/nbviewer_badge.svg)](https://nbviewer.org/github/nvidia/cosmos/blob/main/cookbooks/cosmos3/generator/action/basic_examples/run_fd_with_cosmos_framework.ipynb) | +| Forward dynamics with vLLM-Omni | Generator | Forward dynamics: action-conditioned future-observation prediction for AV, DROID, and UMI, against an OpenAI-compatible vLLM-Omni server. | [Notebook](cookbooks/cosmos3/generator/action/basic_examples/run_fd_with_vllm.ipynb) | [![Render with nbviewer](https://raw.githubusercontent.com/jupyter/design/master/logos/Badges/nbviewer_badge.svg)](https://nbviewer.org/github/nvidia/cosmos/blob/main/cookbooks/cosmos3/generator/action/basic_examples/run_fd_with_vllm.ipynb) | +| Inverse dynamics with Cosmos Framework | Generator | Inverse dynamics: ego-motion trajectory prediction from input AV video, through the `cosmos_framework.scripts.inference` entrypoint. | [Notebook](cookbooks/cosmos3/generator/action/basic_examples/run_id_with_cosmos_framework.ipynb) | [![Render with nbviewer](https://raw.githubusercontent.com/jupyter/design/master/logos/Badges/nbviewer_badge.svg)](https://nbviewer.org/github/nvidia/cosmos/blob/main/cookbooks/cosmos3/generator/action/basic_examples/run_id_with_cosmos_framework.ipynb) | +| Inverse dynamics with vLLM-Omni | Generator | Inverse dynamics: ego-motion trajectory prediction from input AV video, against an OpenAI-compatible vLLM-Omni server. 
| [Notebook](cookbooks/cosmos3/generator/action/basic_examples/run_id_with_vllm.ipynb) | [![Render with nbviewer](https://raw.githubusercontent.com/jupyter/design/master/logos/Badges/nbviewer_badge.svg)](https://nbviewer.org/github/nvidia/cosmos/blob/main/cookbooks/cosmos3/generator/action/basic_examples/run_id_with_vllm.ipynb) | +| Transfer with Cosmos Framework | Generator | Video transfer: edge, blur, depth, segmentation, and world-scenario controls with captions, through the `cosmos_framework.scripts.inference` entrypoint. | [Notebook](cookbooks/cosmos3/generator/transfer/basic_examples/run_video_transfer_with_cosmos_framework.ipynb) | [![Render with nbviewer](https://raw.githubusercontent.com/jupyter/design/master/logos/Badges/nbviewer_badge.svg)](https://nbviewer.org/github/nvidia/cosmos/blob/main/cookbooks/cosmos3/generator/transfer/basic_examples/run_video_transfer_with_cosmos_framework.ipynb) | +| Reasoner with Cosmos Framework | Reasoner | Text and image reasoning: detailed captioning, robot task planning, 2D grounding, describe-anything, and action-trajectory prompts, through the `cosmos_framework.scripts.inference` entrypoint. | [Notebook](cookbooks/cosmos3/reasoner/basic_examples/run_with_cosmos_framework.ipynb) | [![Render with nbviewer](https://raw.githubusercontent.com/jupyter/design/master/logos/Badges/nbviewer_badge.svg)](https://nbviewer.org/github/nvidia/cosmos/blob/main/cookbooks/cosmos3/reasoner/basic_examples/run_with_cosmos_framework.ipynb) | +| Reasoner with vLLM | Reasoner | Image and video reasoning: captioning, temporal localization, embodied reasoning, common-sense reasoning, 2D grounding, describe-anything, action CoT, driving scenes, physical-plausibility, and situation understanding, against an OpenAI-compatible vLLM server (Cosmos3-Super on 4 GPUs by default; switch to Nano per the cookbook README). 
| [Notebook](cookbooks/cosmos3/reasoner/basic_examples/run_with_vllm.ipynb) | [![Render with nbviewer](https://raw.githubusercontent.com/jupyter/design/master/logos/Badges/nbviewer_badge.svg)](https://nbviewer.org/github/nvidia/cosmos/blob/main/cookbooks/cosmos3/reasoner/basic_examples/run_with_vllm.ipynb) | +| Reasoner with NIM | Reasoner | The same image and video reasoning examples as the vLLM notebook, run against the prebuilt, OpenAI-compatible [Cosmos 3 Reasoner NIM](https://catalog.ngc.nvidia.com/orgs/nim/teams/nvidia/containers/cosmos3-reasoner) container; local media is sent as base64 data URIs. | [Notebook](cookbooks/cosmos3/reasoner/basic_examples/run_with_nim.ipynb) | [![Render with nbviewer](https://raw.githubusercontent.com/jupyter/design/master/logos/Badges/nbviewer_badge.svg)](https://nbviewer.org/github/nvidia/cosmos/blob/main/cookbooks/cosmos3/reasoner/basic_examples/run_with_nim.ipynb) | ### Inference Benchmarks diff --git a/cookbooks/cosmos3/README.md b/cookbooks/cosmos3/README.md index ecf78a06..48f6474c 100644 --- a/cookbooks/cosmos3/README.md +++ b/cookbooks/cosmos3/README.md @@ -203,7 +203,7 @@ export VLLM_USE_DEEP_GEMM=0 All Reasoner cookbooks talk to an OpenAI-compatible chat-completions API. After [installing vLLM](#vllm), run the commands below from `cookbooks/cosmos3/reasoner` (same working directory as -[`run_with_vllm.ipynb`](reasoner/run_with_vllm.ipynb)). That sets +[`run_with_vllm.ipynb`](reasoner/basic_examples/run_with_vllm.ipynb)). That sets `$(dirname "$(pwd)")` to `/cookbooks/cosmos3`, which matches the notebook's `COSMOS3_MEDIA_ROOT`. 
@@ -221,7 +221,7 @@ vllm serve nvidia/Cosmos3-Nano \ --port 8000 ``` -**Cosmos3-Super** (four GPUs; default in [`run_with_vllm.ipynb`](reasoner/run_with_vllm.ipynb), port 8001): +**Cosmos3-Super** (four GPUs; default in [`run_with_vllm.ipynb`](reasoner/basic_examples/run_with_vllm.ipynb), port 8001): ```bash export COSMOS3_MEDIA_ROOT="$(dirname "$(pwd)")" From 271f6c5338134c226a78367674938a8174a93819 Mon Sep 17 00:00:00 2001 From: paularamo Date: Tue, 16 Jun 2026 14:16:36 -0400 Subject: [PATCH 4/4] Add author attribution requirement to contributing guidelines Adds "6. Author Attribution" to the cookbook quality requirements, requiring all recipes to include author name(s) and organization in both the README and notebook first cell. Updates the README template with the author block format. --- CONTRIBUTING.md | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 48c67c1b..4e9d9cf5 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -169,6 +169,22 @@ Examples: For markdown-only guides (no notebook): `run__with_.md` +### 6. Author Attribution + +Every cookbook must credit its authors to increase visibility and recognition: + +- **README:** Include an author block immediately after the title (see [README template](#cookbook-readme-template)) +- **Notebook:** Include an author block in the first markdown cell, right after the SPDX header and title + +Use this format: + +```markdown +> **Authors:** [Full Name](https://linkedin.com/in/handle), [Full Name](https://linkedin.com/in/handle) +> **Organization:** [Your Organization](https://your-org.com/) +``` + +This is required for all new contributions and encouraged for existing cookbooks. + --- ## Cookbook README Template @@ -178,6 +194,9 @@ Each cookbook directory needs a `README.md`. 
Use this structure: ```markdown # [Cookbook Title] +> **Authors:** [Your Name](https://linkedin.com/in/your-handle) +> **Organization:** [Your Organization](https://your-org.com/) + One-paragraph description of what this cookbook demonstrates and why it matters. ## What You'll Build