Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions .agents/scripts/api-discovery/.gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
# Per-developer override of <workspace-root> for the api-discovery
# extraction cache. Contains an absolute path; do not commit.
.workspace-root
158 changes: 158 additions & 0 deletions .agents/scripts/api-discovery/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,158 @@
# `api-discovery` scripts

Resolve the on-disk location of a Maven artifact's source code for
agents and developers, without repeatedly `unzip`-ing JARs out of the
Gradle cache.

The agent-facing documentation lives in
[`../../skills/api-discovery/SKILL.md`](../../skills/api-discovery/SKILL.md);
this file is the implementation reference.

## Why

Agents investigating library APIs used to run dozens of `find
~/.gradle/caches` + `unzip -l` + `unzip -p` calls per question. Each
`unzip` decompresses the archive from scratch — slow and token-heavy.

This package replaces that pattern with two cheap reads:

1. **Sibling-first** — every Spine artifact maps to a sibling clone
under `<workspace-root>/<repo>/`. The source tree is already on
disk; we just resolve the right submodule path.
2. **Extraction cache** — non-Spine artifacts (Jackson, Guava, etc.)
have their `-sources.jar` extracted **once** to a per-workstation
cache. Subsequent queries return instantly.

## Layout

```
.agents/scripts/api-discovery/
├── README.md # this file
├── lib/common.sh # shared bash helpers
├── discover # main entry — resolve a coordinate to a path
├── extract-sources # one-shot JAR extraction (race-safe)
├── update-sibling # `git pull --ff-only` a stale sibling (safe-guarded)
└── clean-cache # prune the extraction cache
```

The cache itself is **not** under the repo. It lives at:

```
<workspace-root>/.agents/caches/api-discovery/sources/<group>/<artifact>/<version>/
```

`<workspace-root>` defaults to the parent of the consumer repo (e.g.
`/Users/<you>/Projects/Spine/` when the consumer repo is
`.../Spine/config/`). To override, write the absolute path to an
alternative root into `.workspace-root` next to this README (the
script is gitignored).

## Bootstrap

First-time use needs the cache directory created. The scripts detect
its absence and exit `10`; the skill instructs the agent to ask the
user whether to:

1. **Approve** the default path,
2. **Provide an alternative** workspace root,
3. **Disable** the cache for this repo (recorded in per-developer
auto-memory; sibling-first resolution still works).

## Scripts

### `discover`

```
discover <group>:<artifact>:<version>
discover <group>:<artifact> # version pulled from buildSrc
discover <artifact> # Spine-only; group + version inferred
```

- **stdout** — absolute path to a directory you can `Grep`/`Read`.
- **stderr** — `STALE` warnings when the sibling's `versionToPublish`
differs from the declared dependency version, plus other
diagnostics. Always inspect.
- **exit 0** — path resolved (even if stale; the warning is on stderr).
- **exit 1** — unresolvable (missing sibling AND no sources JAR).
- **exit 10** — cache uninitialized; run the bootstrap flow.

### `extract-sources`

```
extract-sources <group>:<artifact>:<version>
```

Idempotent and race-safe. If the target directory is already populated
the script returns its path immediately. Concurrent first-time
extractions race on an atomic `mv` of a per-PID temp directory; the
loser discards its temp.

### `update-sibling`

```
update-sibling <sibling-name> # resolved under <workspace-root>
update-sibling <absolute-path> # acts on that path directly
```

Invoked by the agent (after user consent) when `discover` emits a
`STALE` warning. Safe by design:

- Pulls **only** when the sibling is on `master` or `main` with a
clean working tree and a tracked upstream.
- On any other branch, exits `0` without touching anything — the
user's "advancing multiple subprojects at once" workflow keeps
feature branches checked out as intentional staging state.
- Refuses on detached HEAD (`3`), uncommitted changes (`4`), or
missing upstream (`5`).
- Never switches branches, never `--rebase`, never `--force`.

On success (exit `0`), the script writes a single stable token to
**stdout** that names the outcome — callers should branch on the
token, not on stderr text. Failure paths produce empty stdout.

Exit codes:

| Code | stdout | Meaning |
|---|---|---|
| `0` | `pulled` | HEAD advanced to upstream tip |
| `0` | `up-to-date` | Already at upstream tip; nothing to do |
| `0` | `skipped-branch` | On a non-default branch; left untouched |
| `1` | _(empty)_ | Sibling not on disk |
| `2` | _(empty)_ | Not a git repository |
| `3` | _(empty)_ | Detached HEAD — refused |
| `4` | _(empty)_ | Working tree dirty — refused |
| `5` | _(empty)_ | No upstream tracking on default branch — refused |
| `6` | _(empty)_ | `git pull --ff-only` failed (divergence, network, etc.) |
| `64` | _(empty)_ | Usage error (no/too many arguments) — BSD `sysexits(3)` convention |

### `clean-cache`

```
clean-cache --dry-run
clean-cache --older-than 30d [--dry-run]
clean-cache --all [--dry-run]
```

Manual pruning only. Nothing runs on a timer.

## Conventions

- **Bash 3.2 compatible** — macOS ships 3.2 by default.
- **No external dependencies** beyond `bash`, coreutils, `grep`,
`sed`, `awk`, `unzip`, `find`, and `git` (used only by
`update-sibling`).
- **stdout** is always the answer; **stderr** is diagnostics. Mix
them only by piping.
- Scripts source `lib/common.sh` after setting
`SPINE_API_DISCOVERY_DIR`, so the workspace-root pointer file is
reachable.

## Troubleshooting

| Symptom | Likely cause | Fix |
|---|---|---|
| `cache not initialized` (exit 10) | Bootstrap not run | Follow the skill's bootstrap prompt |
| `sibling not on disk` | Spine repo not cloned | `git clone` it next to your consumer repo |
| `STALE: ...` | Sibling drifted from declared version | Run `update-sibling <path>` (auto-skips feature branches), or accept the warning |
| `is in the Gradle cache but publishes no -sources.jar` | Upstream artifact has no sources | Read the binary `.class` files via a different tool, or look at GitHub directly |
| `is not in the local Gradle cache` | Gradle has not fetched the dep | `./gradlew dependencies` to populate, then retry |
143 changes: 143 additions & 0 deletions .agents/scripts/api-discovery/clean-cache
Original file line number Diff line number Diff line change
@@ -0,0 +1,143 @@
#!/usr/bin/env bash
#
# Prune the workstation api-discovery extraction cache.
#
# Usage:
# clean-cache --dry-run # list what would be removed (no deletes)
# clean-cache --older-than 30d # remove entries older than 30 days
# clean-cache --older-than 7d --dry-run
# clean-cache --all # wipe the whole sources cache
# clean-cache --all --dry-run
#
# The cache is at
# `<workspace-root>/.agents/caches/api-discovery/sources/<group>/<artifact>/<version>/`.
# "Age" is the directory's mtime (recorded at extraction time).
#
# Exit codes (see lib/common.sh):
# 0 — succeeded (with or without removals).
# 1 — bad arguments or filesystem error.
# 10 — cache not initialized (nothing to clean).
set -euo pipefail

SCRIPT_DIR="$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" && pwd)"
SPINE_API_DISCOVERY_DIR="$SCRIPT_DIR"
export SPINE_API_DISCOVERY_DIR
# shellcheck source=lib/common.sh
. "$SCRIPT_DIR/lib/common.sh"

usage() {
cat >&2 <<'EOF'
Usage:
clean-cache --dry-run
clean-cache --older-than <DURATION> [--dry-run]
clean-cache --all [--dry-run]

DURATION is a `find -mtime`-style suffix: `7d`, `30d`, `90d` (days only).
EOF
exit "$EX_FAIL"
}

mode=""
days=""
dry_run=0

while [ "$#" -gt 0 ]; do
case "$1" in
--dry-run)
dry_run=1
shift
;;
--all)
mode="all"
shift
;;
--older-than)
mode="older-than"
shift
[ "$#" -gt 0 ] || usage
case "$1" in
*d) days="${1%d}" ;;
*) log_warn "duration must end in 'd' (days), got: $1"; usage ;;
esac
case "$days" in
''|*[!0-9]*) log_warn "duration days must be numeric, got: $1"; usage ;;
esac
shift
;;
-h|--help)
usage
;;
*)
log_warn "unknown argument: $1"
usage
;;
esac
done

# Default to a no-op listing if no mode was given.
[ -n "$mode" ] || mode="older-than-default"
Comment thread
alexander-yevsyukov marked this conversation as resolved.

if ! cache_initialized; then
log_warn "cache not initialized: $(cache_root)"
exit "$EX_NO_CACHE"
fi

sources="$(sources_root)"

# Collect targets into a temp list so we can preview and act consistently.
list_file="$(mktemp -t api-discovery-clean.XXXXXX)"
trap 'rm -f -- "$list_file"' EXIT

case "$mode" in
all)
find "$sources" -mindepth 3 -maxdepth 3 -type d -print > "$list_file" 2>/dev/null || true
;;
older-than)
find "$sources" -mindepth 3 -maxdepth 3 -type d -mtime "+$days" -print \
> "$list_file" 2>/dev/null || true
;;
older-than-default)
log_warn "no mode specified; use --all or --older-than <N>d"
usage
Comment thread
alexander-yevsyukov marked this conversation as resolved.
;;
esac

count="$(wc -l < "$list_file" | tr -d '[:space:]')"

# Prune now-empty `<group>/` and `<group>/<artifact>/` parents so the cache
# layout stays tidy. Two passes (artifact dirs first, then group dirs) so
# that emptying a group's last artifact also reclaims the group dir.
# Skipped under --dry-run so the command stays read-only.
prune_empty_parents() {
find "$sources" -mindepth 2 -maxdepth 2 -type d -empty -exec rmdir -- {} + \
2>/dev/null || true
find "$sources" -mindepth 1 -maxdepth 1 -type d -empty -exec rmdir -- {} + \
2>/dev/null || true
}

if [ "$count" -eq 0 ]; then
log_warn "no entries match; cache untouched"
[ "$dry_run" -eq 0 ] && prune_empty_parents
exit "$EX_OK"
fi

if [ "$dry_run" -eq 1 ]; then
log_warn "would remove $count entr$( [ "$count" -eq 1 ] && printf 'y' || printf 'ies' ):"
cat -- "$list_file"
exit "$EX_OK"
fi

removed=0
while IFS= read -r path; do
[ -n "$path" ] || continue
if rm -rf -- "$path"; then
removed=$((removed + 1))
else
log_warn "failed to remove: $path"
fi
done < "$list_file"

prune_empty_parents

log_warn "removed $removed entr$( [ "$removed" -eq 1 ] && printf 'y' || printf 'ies' )"
exit "$EX_OK"
Loading
Loading