daytonaio · goodgoodclaw · May 31, 2026
diff --git a/articles/20260531_run_ai_engineer_rehearsals_in_daytona.md b/articles/20260531_run_ai_engineer_rehearsals_in_daytona.md
@@ -0,0 +1,241 @@
+---
+title: 'Run AI Engineer Rehearsals in Daytona'
+description: 'Use Daytona workspaces to run Omni Engineer and Claude Engineer as a reproducible pair-review loop for dependency upgrades.'
+date: 2026-05-31
+author: 'Goodgood Claw'
+tags: ['daytona', 'ai engineering', 'dev containers']
+---
+
+# Run AI Engineer Rehearsals in Daytona
+
+## Introduction
+
+AI coding assistants are most useful when their work happens inside an
+environment that can be recreated, inspected, and deleted. A chat transcript can
+suggest a fix, but a repository branch, a passing test run, and a written risk
+note are the parts a maintainer can actually review. Daytona is a good fit for
+that style of work because it turns a repository configuration into a clean
+workspace instead of relying on the developer's laptop setup.
+
+This article shows a practical workflow for running
+[Omni Engineer](https://github.com/Doriandarko/omni-engineer) and
+[Claude Engineer](https://github.com/Doriandarko/claude-engineer) inside Daytona.
+The example is a [dependency upgrade rehearsal](../definitions/20260531_definition_dependency_upgrade_rehearsal.md):
+a disposable dry run where one assistant maps the change, the other challenges
+the plan, and the developer turns the useful parts into a small pull request.
+
+![Daytona AI engineer rehearsal workflow](assets/20260531_run_ai_engineer_rehearsals_in_daytona_img1.svg)
+
+## TL;DR
+
+- Put Omni Engineer and Claude Engineer in separate Daytona workspaces so each
+  assistant starts from a clean, reproducible environment.
+- Store `OPENROUTER_API_KEY`, `ANTHROPIC_API_KEY`, and optional tool keys as
+  environment variables, not committed files.
+- Use Omni Engineer for repository mapping and upgrade planning.
+- Use Claude Engineer for second-pass review, regression checklists, and web or
+  CLI follow-up.
+- Treat AI output as advisory. The artifacts to trust are the branch, the tests,
+  and the pull request notes created inside Daytona.
+
+## Why Run AI Engineers in Daytona?
+
+Local AI-assisted development often drifts into a messy state. One terminal has
+an activated virtual environment, another has a stale dependency cache, and a
+third contains exported API keys from yesterday's experiment. That is fine for
+quick exploration, but it becomes hard to review when the work needs to be
+shared.
+
+Daytona gives the assistant a tighter boundary. The workspace starts from the
+repository, installs declared dependencies, and exposes only the environment
+variables you choose to provide. If the assistant suggests a package bump, you
+can test it in a branch and throw the workspace away afterward. If another
+reviewer wants to repeat the same run, they can create the same workspace
+instead of reconstructing your machine.
+
+This separation matters even more when you use more than one assistant. Omni
+Engineer and Claude Engineer have different interfaces and strengths. Omni
+Engineer is a lightweight console around OpenRouter models, useful for quick
+mapping, search, and file-context conversations. Claude Engineer provides a CLI
+and Flask web interface around Anthropic's API, with tool execution and
+self-improvement features. In Daytona, you can let them cross-check each other
+without mixing credentials or generated files.
+
+## Prepare the Workspaces
+
+Start by adding the credentials to Daytona's environment store. Use the keys
+that match your providers and leave optional tools empty until you need them.
+
+```bash
+daytona env set OPENROUTER_API_KEY=your_openrouter_key
+daytona env set ANTHROPIC_API_KEY=your_anthropic_key
+# Optional: only needed for Claude Engineer's E2B code execution tool
+daytona env set E2B_API_KEY=your_e2b_key
+```
+
+Omni Engineer reads `OPENROUTER_API_KEY`. Claude Engineer reads
+`ANTHROPIC_API_KEY`, and its E2B-powered code execution tool can use
+`E2B_API_KEY` when enabled. Keeping those values in the workspace environment is
+safer than creating a `.env` file that might be accidentally committed.
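+
+Before starting either assistant, it can help to confirm that the expected keys
+are actually present in the workspace. The following is a minimal sketch, not
+part of either project: `missing_keys` is a hypothetical helper, and it reports
+variable names only, never their values.
+
+```python
+import os
+
+
+def missing_keys(required, environ=None):
+    """Return the names of required variables that are unset or empty."""
+    environ = os.environ if environ is None else environ
+    return [name for name in required if not environ.get(name)]
+
+
+# Hypothetical preflight for the Claude Engineer workspace.
+# E2B_API_KEY is left out because the article treats it as optional.
+absent = missing_keys(["ANTHROPIC_API_KEY"])
+if absent:
+    print("Missing environment variables: " + ", ".join(absent))
+else:
+    print("All required keys are set.")
+```
+
+Passing a plain dictionary as `environ` makes the check easy to try without
+touching the real environment.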
+
+Next, create one workspace for each project:
+
+```bash
+daytona create https://github.com/Doriandarko/omni-engineer
+daytona create https://github.com/Doriandarko/claude-engineer
+```
+
+The cleanest setup is to keep a `.devcontainer/devcontainer.json` in each
+repository. For Omni Engineer, the Dev Container only needs Python, Git, the
+existing `requirements.txt`, and the OpenRouter key. For Claude Engineer, it
+needs the same basics with the Anthropic key instead, and it forwards port
+`5000` so the Flask web UI can open from the workspace.
+
+The important part is not the exact image name. It is that the install command,
+Python version, forwarded ports, and environment variables live in source
+control. Daytona can then create the same assistant workspace every time.
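+
+As a rough illustration of what those files might contain, the sketch below
+builds both configurations as Python dictionaries and prints one as JSON. The
+property names (`image`, `postCreateCommand`, `remoteEnv`, `forwardPorts`) come
+from the Dev Containers specification. The base image, the install command for
+Claude Engineer, and the `${localEnv:...}` pass-through are assumptions, and the
+companion PRs may differ.
+
+```python
+import json
+
+
+def devcontainer(name, install_cmd, env_keys, ports=()):
+    """Build a devcontainer.json dictionary for one assistant workspace."""
+    config = {
+        "name": name,
+        # Assumed base image; any image with Python and Git would do.
+        "image": "mcr.microsoft.com/devcontainers/python:3.11",
+        "postCreateCommand": install_cmd,
+        # Reference keys by name so no secret value lands in source control.
+        "remoteEnv": {key: "${localEnv:" + key + "}" for key in env_keys},
+    }
+    if ports:
+        config["forwardPorts"] = list(ports)
+    return config
+
+
+omni = devcontainer(
+    "omni-engineer",
+    "pip install -r requirements.txt",
+    ["OPENROUTER_API_KEY"],
+)
+claude = devcontainer(
+    "claude-engineer",
+    "pip install -r requirements.txt",  # assumed install command
+    ["ANTHROPIC_API_KEY", "E2B_API_KEY"],
+    ports=[5000],  # Flask web UI
+)
+
+print(json.dumps(claude, indent=2))
+```
+
+Only the Claude Engineer configuration gets a `forwardPorts` entry, which
+matches the split described above.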
+
+## Split the Assistants by Job
+
+A dependency upgrade rehearsal works best when the assistants have different
+responsibilities. Do not ask both tools to make broad changes at the same time.
+That produces overlapping edits and makes it harder to see which suggestion was
+useful.
+
+Use Omni Engineer first as the mapper. In the target repository, ask it to
+summarize:
+
+- the package manager and lockfile in use
+- the dependency you want to upgrade
+- code paths that import or configure that dependency
+- tests that already cover those paths
+- likely migration notes from the changelog
+
+The output you want is not code yet. You want a small plan with files, commands,
+and risk areas. For example, a useful Omni Engineer result might say: "Upgrade
+the HTTP client, inspect middleware initialization, run the API tests, and add a
+regression case for timeout handling." That is specific enough to act on and
+small enough to review.
+
+Then use Claude Engineer as the challenger. Feed it the plan, the lockfile diff,
+and the first test result. Ask it what is missing, which assumptions are weak,
+and which regression test would prove the behavior. This second pass often
+surfaces gaps the first plan missed. One assistant proposes the path; the other
+tries to find the sharp edges.
+
+## Run a Rehearsal Branch
+
+Create a branch in the project you actually want to upgrade:
+
+```bash
+git checkout -b rehearse-http-client-upgrade
+```
+
+Apply the smallest dependency change first. For Python, that might be a single
+`requirements.txt` edit. For Node.js, it may be a `package.json` and lockfile
+update. Avoid combining the upgrade with formatting, folder moves, or unrelated
+cleanup. The branch should answer one question: can this dependency move safely?
+
+With the dependency bumped, run the existing test suite before changing any source code:
+
+```bash
+python -m pytest
+```
+
+If the first run fails, paste the failing test name and stack trace into Claude
+Engineer. Ask for a diagnosis, not a patch. When the diagnosis points to a real
+breaking change, make the code edit yourself or review the assistant's proposed
+edit line by line before applying it.
+
+After the source code is fixed, add one regression test for the behavior that
+could break again. This is the part maintainers care about. A version bump with
+no test evidence asks them to trust the tool. A version bump with a focused
+regression test gives them something stable to review.
+
+## Keep Secrets and Context Out of Git
+
+AI engineer tools can make it tempting to paste everything into the prompt:
+environment values, private issue comments, service tokens, and entire terminal
+histories. Do not do that. A good Daytona workflow gives the assistant enough
+context to reason about the code while keeping private state outside the
+repository.
+
+Use these rules during the rehearsal:
+
+- Never commit `.env`, generated chat logs, or local prompt history.
+- Describe secrets by name, such as `ANTHROPIC_API_KEY`, instead of pasting the
+  secret value.
+- Share failing command output only after removing account IDs, tokens, and
+  private URLs.
+- Keep assistant-generated scratch files out of the final branch unless they are
+  part of the documented project.
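+
+The rule about sharing command output can be partly automated. The sketch below
+is one possible approach under stated assumptions: `redact` is a hypothetical
+helper, and the suffix list for secret-like variable names is a guess, not a
+standard. It replaces known secret values with their variable names and does not
+catch account IDs or private URLs, which still need a manual pass.
+
+```python
+import re
+
+# Assumed naming convention for secret-bearing variables.
+SECRET_NAME = re.compile(r"(_API_KEY|_TOKEN|_SECRET)$")
+
+
+def redact(text, environ):
+    """Replace secret values found in text with a <NAME> placeholder."""
+    secrets = [
+        (name, value)
+        for name, value in environ.items()
+        if value and SECRET_NAME.search(name)
+    ]
+    # Replace longer values first so a short secret that is a substring of a
+    # longer one cannot leave part of the longer one behind.
+    for name, value in sorted(secrets, key=lambda pair: len(pair[1]), reverse=True):
+        text = text.replace(value, "<" + name + ">")
+    return text
+
+
+sample_env = {"ANTHROPIC_API_KEY": "abc123secret", "HOME": "/home/dev"}
+print(redact("auth failed for key abc123secret", sample_env))
+```
+
+In a real workspace you would pass `os.environ` instead of `sample_env`.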
+
+Daytona helps by giving the work a disposable home. When the rehearsal is done,
+you can keep the branch and delete the workspace. The repository history remains
+clean, and the next contributor can reproduce the setup from committed
+configuration.
+
+## Write the Pull Request Evidence
+
+The pull request should explain the upgrade in the same structure as the
+rehearsal:
+
+- what changed
+- why the dependency needed to move
+- which files were touched
+- which tests were run
+- what risk remains
+
+Here is a compact PR note format:
+
+```markdown
+## Summary
+- upgrade the HTTP client dependency
+- adjust timeout handling for the new client behavior
+- add a regression test for timeout handling
+
+## Validation
+- python -m pytest tests/api/test_timeouts.py
+- python -m pytest
+
+## Risk
+- low; the change is limited to API client setup and covered by regression tests
+```
+
+This is also where the two-assistant workflow pays off. Omni Engineer's mapping
+becomes the summary. Claude Engineer's challenge list becomes the risk section.
+Your actual test commands become the validation section. The assistant output is
+not copied blindly; it is distilled into evidence a maintainer can verify.
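+
+If you keep the distilled notes as three short lists during the rehearsal, the
+PR note can be assembled mechanically. This is only a sketch: `render_pr_note`
+is a hypothetical helper, and the list contents are copied from the example note
+above.
+
+```python
+def render_pr_note(summary, validation, risk):
+    """Render three lists of strings as a Summary/Validation/Risk note."""
+    sections = [("Summary", summary), ("Validation", validation), ("Risk", risk)]
+    lines = []
+    for title, items in sections:
+        lines.append("## " + title)
+        lines.extend("- " + item for item in items)
+        lines.append("")  # blank line between sections
+    return "\n".join(lines).rstrip() + "\n"
+
+
+note = render_pr_note(
+    summary=["upgrade the HTTP client dependency"],
+    validation=["python -m pytest tests/api/test_timeouts.py", "python -m pytest"],
+    risk=["low; the change is limited to API client setup"],
+)
+print(note)
+```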
+
+## When to Use This Pattern
+
+This workflow is strongest for changes with a narrow blast radius:
+
+- dependency upgrades
+- SDK migrations
+- small framework configuration changes
+- test coverage for known compatibility issues
+- documentation updates backed by runnable examples
+
+It is weaker for large rewrites or ambiguous product decisions. In those cases,
+the assistant can still help with exploration, but the branch may not be small
+enough for a clean rehearsal. Daytona keeps the environment reproducible, but it
+does not replace engineering judgment about scope.
+
+## Conclusion
+
+Running Omni Engineer and Claude Engineer inside Daytona turns AI-assisted
+coding into a reviewable loop. The workspace gives the tools a clean boundary.
+The branch gives maintainers a concrete artifact. The tests decide whether the
+change works.
+
+That balance is the main benefit. You can move quickly with AI support while
+still producing the kind of evidence that belongs in a serious pull request:
+minimal diffs, repeatable setup, clear validation, and no leaked secrets.
+
+## References
+
+- [Omni Engineer on GitHub](https://github.com/Doriandarko/omni-engineer)
+- [Claude Engineer on GitHub](https://github.com/Doriandarko/claude-engineer)
+- [Daytona environment variables article](20241126_Using_Environmental_Variables_in_Daytona.md)
+- [Dev Containers specification](https://containers.dev/)
+- [Companion Omni Engineer Dev Container PR](https://github.com/Doriandarko/omni-engineer/pull/43)
+- [Companion Claude Engineer Dev Container PR](https://github.com/Doriandarko/claude-engineer/pull/267)
diff --git a/articles/assets/20260531_run_ai_engineer_rehearsals_in_daytona_img1.svg b/articles/assets/20260531_run_ai_engineer_rehearsals_in_daytona_img1.svg
diff --git a/authors/goodgood_claw.md b/authors/goodgood_claw.md
@@ -0,0 +1,23 @@
+Author: Goodgood Claw
+
+Title: Independent Open Source Contributor
+
+Description: Goodgood Claw is an independent open-source contributor focused on
+developer tooling, reproducible environments, and practical AI-assisted
+engineering workflows. They write about using automation without losing the
+review habits that keep software changes trustworthy.
+
+Author Image: [GitHub avatar](https://github.com/goodgoodclaw.png?size=512)
+
+Author LinkedIn:
+
+Author Twitter:
+
+Company Name: Independent Contributor
+
+Company Description: Independent software contributor focused on practical
+developer workflow writing and tooling.
+
+Company Logo Dark: N/A
+
+Company Logo White: N/A
diff --git a/definitions/20260531_definition_dependency_upgrade_rehearsal.md b/definitions/20260531_definition_dependency_upgrade_rehearsal.md
@@ -0,0 +1,32 @@
+---
+title: 'Dependency Upgrade Rehearsal'
+description: 'A disposable, repeatable dry run used to plan, test, and document a dependency upgrade before it reaches the main branch.'
+date: 2026-05-31
+author: 'Goodgood Claw'
+---
+
+# Dependency Upgrade Rehearsal
+
+## Definition
+
+A dependency upgrade rehearsal is a controlled dry run for changing one or more
+software dependencies before the change is proposed for the main branch. The
+developer uses a disposable environment, a temporary branch, and a clear test
+checklist to discover breaking changes, migration steps, security notes, and
+release risks early.
+
+## Context and Usage
+
+In a Daytona workspace, a dependency upgrade rehearsal is useful because the
+environment can be recreated from repository configuration instead of the
+developer's laptop state. A team can run the same package manager commands,
+execute the same tests, and compare results without manually rebuilding local
+toolchains.
+
+The rehearsal usually starts by reading the package changelog and lockfile diff,
+then applying the smallest possible upgrade. The developer records failing
+tests, fixes code paths affected by the new dependency, and adds regression
+coverage for behavior that changed. When AI assistants are used, their output
+should be treated as planning and review input. The durable artifacts remain the
+branch, test results, and pull request notes produced inside the reproducible
+workspace.