Lightweight bootstrap installer for running the AIA agentic evaluation pipelines.
This repo contains only one file:
- `install.sh`: downloads the launcher from its canonical upstream location and installs it into `~/.local/bin/` as the `aia-evaluation` command.

The launcher itself, plus the per-platform .pyz files it fetches, live
upstream. There is no duplication or sync.
Prerequisites:

- Python 3.12 on `PATH`: `brew install python@3.12` or `uv python install 3.12`.
- GitHub authentication for the JetBrains org: either an active `gh auth login` session or `GH_TOKEN` exported in your shell. Any PAT that grants you access to the JetBrains GitHub organization works; no special per-repo permissions are needed.

  Required both at install time (to pull the launcher script) and at runtime (to pull the `.pyz`).

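Both prerequisites can be checked by hand before installing. The snippet below is a minimal sketch, not the installer's actual logic; the function names `have_python312`, `have_github_auth` and `check_prereqs` are made up for illustration, and the check only confirms that some credential is present, not that it grants access to the JetBrains org.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helpers; the real install.sh may check differently.

# Succeeds if a python3.12 executable is on PATH.
have_python312() {
  command -v python3.12 >/dev/null 2>&1
}

# Succeeds if GH_TOKEN is set and non-empty, or if gh reports an active session.
have_github_auth() {
  if [ -n "${GH_TOKEN:-}" ]; then
    return 0
  fi
  command -v gh >/dev/null 2>&1 && gh auth status >/dev/null 2>&1
}

# Prints one line per missing prerequisite; returns non-zero if any is missing.
check_prereqs() {
  local missing=0
  if ! have_python312; then
    echo "python3.12 not found on PATH" >&2
    missing=1
  fi
  if ! have_github_auth; then
    echo "no GitHub auth: run 'gh auth login' or export GH_TOKEN" >&2
    missing=1
  fi
  return "$missing"
}
```

Source the snippet and run `check_prereqs`; an exit status of zero means both checks passed.
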
To install, run:

    curl -fsSL https://raw.githubusercontent.com/dpaia/aia-evaluation/main/install.sh | bash

CLI flags are forwarded directly to the upstream agentic pipeline:

    aia-evaluation --lang java --runner JUNIE_ACP --debug
    aia-evaluation --lang python --runner CLAUDE_CODE --debug-n-instances 1
    aia-evaluation --lang csharp --runner GOLDEN --debug

Required flags (the launcher errors out fast otherwise):

- `--lang java|kotlin|python|go|csharp`: must target a specific language; `all` is rejected.
- `--runner <NAME>`: see `aia-evaluation --help` for the supported list.

If you don't pass --dataset-tag, the launcher injects --dataset-tag default_agent
automatically. Any value you pass is respected as-is.
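
The default-injection rule can be pictured with a few lines of shell. This is an illustrative sketch only, not the launcher's code: the function name `with_default_dataset_tag` is hypothetical, and treating both `--dataset-tag VALUE` and `--dataset-tag=VALUE` as "passed" is an assumption.

```shell
#!/usr/bin/env bash
# Illustrative only: mimics the documented default-injection rule.
# Assumption: "--dataset-tag VALUE" and "--dataset-tag=VALUE" both count as passed.

# Prints the final argument list, one argument per line.
with_default_dataset_tag() {
  local arg has_tag=0
  for arg in "$@"; do
    case "$arg" in
      --dataset-tag|--dataset-tag=*) has_tag=1 ;;
    esac
  done
  # Append the default only when the caller did not supply the flag.
  if [ "$has_tag" -eq 0 ]; then
    set -- "$@" --dataset-tag default_agent
  fi
  printf '%s\n' "$@"
}
```

For example, `with_default_dataset_tag --lang java --runner GOLDEN` prints the three original arguments followed by `--dataset-tag` and `default_agent`, while an explicit `--dataset-tag custom` is passed through unchanged.
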
On each run, the launcher:

- Verifies `python3.12` is on `PATH`.
- Verifies your GitHub auth (`gh auth status` or `GH_TOKEN`).
- Detects your platform via `uname -sm` and composes `aia-evaluation-<os>-<arch>.pyz`.
- Downloads the matching `.pyz` (with progress bar) and a tiny `VERSION` file into `~/.cache/aia-evaluation/`.
- Executes the `.pyz`, which:
  - Validates `--lang` / injects `--dataset-tag default_agent` if needed.
  - Self-heals ZenML state (runs `zenml login jcp-prod` / `zenml project set ai-assistant` / `zenml stack set …` for you, using the bundled `zenml` CLI inside the `.pyz`).
  - Runs the actual agentic pipeline.

On subsequent runs, only the small `VERSION` file (~40 bytes) is fetched. If it matches the cached copy, the cached `.pyz` is reused with no re-download. If a newer build is available upstream, the launcher refreshes the `.pyz` transparently and shows a progress bar.
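
The cache decision boils down to a file comparison. The sketch below assumes the upstream `VERSION` has already been downloaded to a local file; the function name `needs_refresh` and its argument order are hypothetical, and how the launcher actually fetches or compares the file is not shown in this README.

```shell
#!/usr/bin/env bash
# Sketch of the cache check. Paths are passed in as arguments;
# the real launcher's file layout and comparison may differ.

# Succeeds (exit 0) when the cached .pyz should be re-downloaded.
needs_refresh() {
  local cached_version_file=$1 upstream_version_file=$2 cached_pyz=$3
  # Nothing cached yet: a download is required.
  [ -f "$cached_pyz" ] || return 0
  [ -f "$cached_version_file" ] || return 0
  # cmp -s is silent and exits non-zero when the two files differ.
  ! cmp -s "$cached_version_file" "$upstream_version_file"
}
```

A caller would download the `.pyz` only when `needs_refresh` succeeds, and otherwise execute the cached copy directly.
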
Pre-built .pyz files exist for:

| `uname -sm` | Asset |
|---|---|
| `Linux x86_64` | `aia-evaluation-linux-x86_64.pyz` |
| `Darwin arm64` | `aia-evaluation-darwin-arm64.pyz` |

Other host platforms (Intel Mac, Linux ARM, Windows) aren't built. Running
aia-evaluation on those produces a clear "release does not contain asset
…" error. If you need another target, file a request upstream.
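
To see which asset name your machine would map to, the naming scheme from the table can be reproduced in a few lines. This is a sketch under assumptions: lowercasing the OS field and keeping the architecture field verbatim is inferred from the two rows above, and the helper names `asset_name` and `is_prebuilt` are hypothetical.

```shell
#!/usr/bin/env bash
# Composes an asset name from `uname -sm` output.
# Assumption: OS is lowercased, architecture is used verbatim.

asset_name() {
  local os arch
  # Split e.g. "Linux x86_64" into its two fields.
  read -r os arch <<<"$1"
  os=$(printf '%s' "$os" | tr '[:upper:]' '[:lower:]')
  printf 'aia-evaluation-%s-%s.pyz\n' "$os" "$arch"
}

# Succeeds only for the two assets listed as pre-built.
is_prebuilt() {
  case "$1" in
    aia-evaluation-linux-x86_64.pyz|aia-evaluation-darwin-arm64.pyz) return 0 ;;
    *) return 1 ;;
  esac
}
```

Running `is_prebuilt "$(asset_name "$(uname -sm)")"` before installing tells you whether a pre-built asset exists for the current host.
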
To update the launcher, re-run the install command: `install.sh` overwrites the launcher with the latest copy from the upstream default branch. The `.pyz` itself is auto-updated on every invocation when upstream publishes a new build (the launcher compares the cached `VERSION` against the release's `VERSION`).
To uninstall:

    rm ~/.local/bin/aia-evaluation
    rm -rf ~/.cache/aia-evaluation

The installer drops `aia-evaluation` into `~/.local/bin/`. If that directory isn't on your `PATH`, the script prints exactly what to add to your `~/.zshrc` / `~/.bashrc`:

    export PATH="$HOME/.local/bin:$PATH"

After sourcing your shell config (or opening a fresh terminal), confirm:

    which aia-evaluation
    # → /Users/<you>/.local/bin/aia-evaluation

To install into a different directory, set `AIA_PREFIX`. The variable must be set on `bash` (the process that runs the script), not on `curl`, because a variable prefixed to `curl` is not visible to the other side of the pipe:

    curl -fsSL https://raw.githubusercontent.com/dpaia/aia-evaluation/main/install.sh | AIA_PREFIX=/usr/local/bin bash

The default branch (`launcher` for now) can be overridden:

    curl -fsSL https://raw.githubusercontent.com/dpaia/aia-evaluation/main/install.sh | AIA_BRANCH=main bash

The pipeline code lives in a JetBrains-internal repo, which is why anything
that touches it (the launcher source, the .pyz) needs GitHub auth against
the JetBrains org. This installer repo (dpaia/aia-evaluation) is intentionally
public so that anyone can fetch the bootstrap `install.sh` anonymously. Past
the bootstrap step, the same `GH_TOKEN` / `gh` session powers both install and
runtime.