Observable-only Autonomic Slack Gradient for local-first AI agent workflow optimization.
OASG is a local-first toolkit for long-running AI-agent workflows. It records what the agent can observe, reduces that history into operational state, proposes bounded workflow-policy changes, tests them through receipts, and promotes only changes that improve conservative operational viability without protected regression.
The project target is not a smarter model. The target is a more durable workflow:
keep running, learn from observable history, and improve operational capability without using an external evaluator as the improvement oracle.
OASG optimizes workflow policy only. It does not fine-tune model weights, does not use an LLM judge, and does not claim semantic truth. Deterministic validators, replay receipts, rollback receipts, resource counters, and ledger checks are ordinary observable channels.
- Wrap any local or remote model as an observation source without making that model trusted.
- Store agent activity as append-only JSONL ledgers with canonical hashes and prefix checks.
- Reduce long-running workflow history into operational debt, pressure, and viability receipts.
- Trial workflow-policy changes through shadow/lease ledgers before promotion.
- Run conservative local optimization loops that can reject, quarantine, roll back, or promote.
- Export JSON Schemas and conformance fixtures for ports in other languages.
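The append-only ledger idea above can be sketched as a hash-chained JSONL log. This is an illustrative sketch only: the field names (`prev`, `hash`), the genesis value, and the canonicalization rules here are assumptions, not OASG's actual record format or OASG-CJ-1 hashing.

```python
import hashlib
import json

def canonical_bytes(record: dict) -> bytes:
    # Illustrative canonical JSON: sorted keys, minimal separators.
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")

def seal(record: dict, prev_hash: str) -> dict:
    # Bind each record to the hash of the previous one (hash chain).
    body = dict(record, prev=prev_hash)
    body["hash"] = hashlib.sha256(canonical_bytes(body)).hexdigest()
    return body

def append(ledger: list[dict], record: dict) -> None:
    prev = ledger[-1]["hash"] if ledger else "0" * 64  # assumed genesis value
    ledger.append(seal(record, prev))

def verify_prefix(ledger: list[dict]) -> bool:
    # Recompute the chain; any tampering or reordering fails the check.
    prev = "0" * 64
    for row in ledger:
        body = {k: v for k, v in row.items() if k != "hash"}
        if body.get("prev") != prev:
            return False
        if hashlib.sha256(canonical_bytes(body)).hexdigest() != row["hash"]:
            return False
        prev = row["hash"]
    return True
```

Because every record commits to its predecessor, verifying a prefix also verifies that history was neither rewritten nor forked before that point.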
Use OASG when you need durable workflow operation, auditability, and fail-closed self-improvement around an agent. Do not use it as a benchmark score, model trainer, LLM judge, sandbox, or semantic truth oracle.
If you have five minutes, start with `docs/quick_mental_model.md`, then run the `examples/minimal_agent_integration` example.
- Quick Mental Model
- Why This Is Different
- Current Status
- Quickstart
- Use OASG With Your Agent
- CLI Map
- Model Integration
- Rejection Guide
- Experiments and Evidence
- Development Checks
- Citation
- Project Layout
- License
OASG is like Git + unit tests + CI gate + rollback receipts for an AI agent workflow. Your agent keeps running in its normal framework. OASG records observable events, checks workflow debt and viability, trials policy changes, and promotes only changes with receipt-backed evidence.
Read the five-minute explanation: `docs/quick_mental_model.md`.
OASG turns long-running AI agents into self-maintaining workflow systems that improve only from observable operational evidence, without LLM judges, external rewards, or model-weight updates.
Most agent-improvement systems optimize for answer quality, benchmark scores, human feedback, LLM-judge feedback, or externally supplied reward functions. OASG instead treats a long-running agent as an operational system whose future action capacity can expand or collapse.
The core object is not accuracy. It is a conservative partial-order vector over:
- viable future action classes;
- unresolved obligations;
- validation, parse, replay, rollback, and evidence debt;
- budget, queue, context, and maintenance pressure;
- protected semantic, taint, boundary, authority, and effect floors;
- shadow, lease, gate, promotion, quarantine, and rollback receipts.
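A partial-order comparison over such a vector behaves differently from a scalar reward: a candidate that improves one coordinate while regressing another is incomparable and must be rejected. A minimal sketch, assuming higher coordinate values are better; the coordinate handling is illustrative, not the actual gate in `src/oasg/gate.py`.

```python
def gate(candidate: dict, baseline: dict, protected: set[str]) -> str:
    """Conservative partial-order gate over a coordinate vector (sketch)."""
    better = worse = False
    for key in baseline:
        if candidate[key] > baseline[key]:
            better = True
        elif candidate[key] < baseline[key]:
            if key in protected:
                return "rejected_floor_violation"  # protected floor regressed
            worse = True
    if worse:
        return "reject"            # incomparable under the partial order: fail closed
    if better:
        return "safe_promotion"    # dominates: no regression, at least one gain
    return "safe_non_regression"   # equal everywhere
```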
The improvement loop is:
append-only JSONL observable history
-> canonical hashing and ledger-prefix verification
-> deterministic reducers
-> finite-chain slack/debt state
-> typed pressure vector and scheduler
-> bounded workflow-policy mutation batch
-> runner-produced shadow/lease trial ledgers
-> finite-horizon KLB_2 viability lower bound
-> sidecar positive evidence witnesses
-> no-meta dominance gate
-> safe_non_regression / safe_promotion / active_promoted / reject / quarantine
The Python package is the reference runtime. The portability contract is language-independent: canonical JSON bytes, SHA-256 domain hashes, JSONL ledgers, JSON receipts, JSON Schemas, and conformance fixtures.
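Domain-separated hashing of canonical JSON bytes can be sketched as follows. The exact OASG-CJ-1 byte rules and domain tags live in `src/oasg/canonical.py`, so the choices here (sorted keys, minimal separators, NUL-terminated domain prefix) are assumptions for illustration.

```python
import hashlib
import json

def canonical_json_bytes(obj) -> bytes:
    # Illustrative canonicalization: UTF-8, sorted keys, no extra whitespace.
    return json.dumps(
        obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")

def domain_hash(domain: str, obj) -> str:
    # Domain separation: the same payload hashed under different domains
    # yields different digests, preventing cross-context receipt confusion.
    h = hashlib.sha256()
    h.update(domain.encode("utf-8") + b"\x00")  # assumed tag framing
    h.update(canonical_json_bytes(obj))
    return h.hexdigest()
```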
Package version: 1.1.0.
This repository is a working reference implementation with a conservative trusted core and an experimental long-running validation suite. It is suitable for local experiments and controlled workflow-policy optimization. It should still be treated as an alpha system for production automation because the safe path intentionally rejects many cases until they have complete receipts.
Implemented:
- OASG-CJ-1 canonical JSON and SHA-256 domain hashing.
- Append-only JSONL ledger sealing, duplicate handling, prefix verification, and quarantine receipts.
- Deterministic reducers over finite-chain dimensions and protected debt.
- Bounded `KLB_2` computation over 8 action classes and 73 trace classes.
- Typed pressure vectors and persistent scheduler state.
- Mutator profiles, outcome memory, cooldown, and bounded workflow-policy mutation batches.
- Structured workflow policy state and mutation patches.
- Runner-backed shadow and lease receipt paths with `ledger-replay`, explicit shell-free `local-command`, and demo-only `demo-replay` runner modes.
- Positive evidence witnesses bound to ledger prefixes, comparison contracts, workload manifests, KLB receipts, and trial receipts.
- No-meta dominance gate with `safe_non_regression`, `safe_promotion`, and conservative rejection.
- `optimize run`, resumable `optimize watch`, and lock-aware `optimize supervise`.
- Workflow library state with active policy, active mutations, rollback snapshots, quarantine, retirement, outcome memory, and conflict receipts.
- Model-agnostic adapters that emit observation events rather than evaluator judgments.
- JSON Schema export and conformance fixtures.
- Ollama `gemma4:e4b` experiment profiles with null, inconclusive, positive, interrupted, and strong-baseline negative results retained.
Not implemented or not claimed:
- No model-weight training or fine-tuning.
- No semantic truth proof.
- No sandbox guarantee.
- No unconstrained network, financial, communication, secret-touching, or irreversible effects by default.
- No active promotion from synthetic/demo evidence.
- No claim that OASG universally improves all agents or all task distributions.
- No claim that `gemma4:e4b` became more intelligent.
Requirements:
- Python `>=3.12`
- `uv`
From a fresh checkout:

```shell
uv sync
uv run oasg demo quickstart
uv run oasg doctor
uv run oasg conformance run examples/conformance
```

Inspect the generated quickstart artifacts:

```shell
uv run oasg ledger verify examples/quickstart/baseline.jsonl
uv run oasg reduce examples/quickstart/candidate.jsonl --out examples/quickstart/reducer_snapshot.json
uv run oasg klb examples/quickstart/reducer_snapshot.json --out examples/quickstart/klb_receipt.json
uv run oasg gate --baseline examples/quickstart/baseline.jsonl --candidate examples/quickstart/candidate.jsonl --contract examples/quickstart/comparison_contract.json --workload examples/quickstart/workload_manifest.json --witnesses examples/quickstart/positive_evidence_witnesses.json
```

Default runtime behavior is local-only and network-free.
| goal | start here |
|---|---|
| understand the concept in 5 minutes | docs/quick_mental_model.md |
| inspect the core receipts | uv run oasg demo quickstart |
| see the shortest agent insertion point | examples/minimal_agent_integration |
| verify a ledger from another implementation | uv run oasg ledger verify history.jsonl |
| wrap an existing agent | Use OASG With Your Agent |
| run a local optimization cycle | uv run oasg optimize run --history history.jsonl --library workflow_library.json --out-dir .oasg/run |
| run repeated local supervision | uv run oasg optimize supervise --history history.jsonl --library workflow_library.json --state optimizer_state.json --out-dir .oasg/supervise |
| reproduce the current evidence | Experiments and Evidence |
OASG does not require a specific model provider. Your agent, model wrapper, tool runner, or workflow engine only needs to emit observable events into an OASG JSONL ledger.
For a quick local ledger:

```shell
uv run oasg observe --out history.jsonl --workflow-id my_agent --component-id planner --dimension budget=acceptable --action pure_read=acceptable --assume-complete
```

`--assume-complete` is a demo shortcut. In real workflows, emit the relevant dimensions, action classes, resources, retry counts, validation results, rollback/evidence receipts, and unresolved obligations explicitly. Missing data fails closed.
```shell
uv run oasg pressure history.jsonl --out pressure_vector.json
uv run oasg scheduler history.jsonl --out scheduler_state.json
```

Pressure is diagnostic and typed. It is not a scalar reward and cannot by itself promote a mutation.
```shell
uv run oasg harness init --out oasg_harness.py
```

Replace the template body with your actual local workflow trial. The command must be deterministic enough for your use case and must emit a sealed OASG JSONL trial ledger. Promotion evidence must come from runner-produced trial ledgers, not from mutation metadata or model text.
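A hypothetical harness body, assuming an argparse CLI matching the `--mutation`/`--candidate` placeholders used with `local-command` below. The real template and trial-ledger sealing come from `oasg harness init`; every field and filename here is illustrative.

```python
# oasg_harness.py — hypothetical sketch, not the generated template.
import argparse
import json
import pathlib

def run_trial(mutation_path: str, candidate_path: str, out_path: str) -> None:
    mutation = json.loads(pathlib.Path(mutation_path).read_text(encoding="utf-8"))
    # candidate_path would seed the trial workload in a real harness
    # (unused in this sketch).
    # ... run your actual local workflow trial deterministically here ...
    events = [
        {"type": "trial_started", "mutation_id": mutation.get("mutation_id")},
        {"type": "trial_finished", "status": "completed"},
    ]
    # Illustrative JSONL output; a real harness must emit a sealed OASG ledger.
    with open(out_path, "w", encoding="utf-8") as f:
        for event in events:
            f.write(json.dumps(event, sort_keys=True) + "\n")

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--mutation", required=True)
    parser.add_argument("--candidate", required=True)
    parser.add_argument("--trial-ledger-out", default="observed_trial.jsonl")
    args = parser.parse_args()
    run_trial(args.mutation, args.candidate, args.trial_ledger_out)
```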
```shell
uv run oasg optimize run --history history.jsonl --library workflow_library.json --out-dir .oasg/run --cycles 1 --runner local-command --runner-arg python --runner-arg oasg_harness.py --runner-arg --mutation --runner-arg "{mutation}" --runner-arg --candidate --runner-arg "{candidate}"
uv run oasg library status --library workflow_library.json
```

The optimizer performs reduce, KLB, pressure, scheduling, mutation proposal, runner-backed shadow/lease trial derivation, comparison over observed trial ledgers, witness creation, gate evaluation, and workflow-library update. If receipts are incomplete, the result is rejected or inconclusive.
```shell
uv run oasg optimize supervise --history history.jsonl --library workflow_library.json --state optimizer_state.json --out-dir .oasg/supervise --max-iterations 1 --runner local-command --runner-arg python --runner-arg oasg_harness.py --runner-arg --mutation --runner-arg "{mutation}" --runner-arg --candidate --runner-arg "{candidate}" --append-lease-observations
uv run oasg optimize state --state optimizer_state.json
uv run oasg library history --library workflow_library.json
```

The supervisor tracks consumed ledger prefixes, pending trials, scheduler state, mutation outcome memory, library hashes, and append receipts. If history shrinks, forks, or disagrees with the saved prefix, it emits a stale/fork receipt and does not promote.
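The stale/fork check can be sketched as a saved-prefix comparison. Hashing whole JSONL lines is an illustration; OASG's real check uses canonical record hashes and append indices.

```python
import hashlib

def line_hash(line: str) -> str:
    # Illustrative: hash raw JSONL lines; OASG hashes canonical records.
    return hashlib.sha256(line.encode("utf-8")).hexdigest()

def check_prefix(saved_prefix: list[str], current_lines: list[str]) -> str:
    """Compare a checkpointed prefix of line hashes against the current ledger."""
    if len(current_lines) < len(saved_prefix):
        return "stale_optimizer_state"       # history shrank
    for want, line in zip(saved_prefix, current_lines):
        if line_hash(line) != want:
            return "stale_optimizer_state"   # history forked from the saved prefix
    if len(current_lines) == len(saved_prefix):
        return "no_new_work"                 # same append index, nothing to do
    return "new_work"
```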
```shell
uv run oasg init
uv run oasg doctor
uv run oasg schema export --out schemas
uv run oasg schema policy --out policy_profile.json
uv run oasg ledger verify history.jsonl
uv run oasg ledger append --ledger history.jsonl --records new_events.jsonl --out history.jsonl
uv run oasg reduce history.jsonl --out reducer_snapshot.json
uv run oasg klb reducer_snapshot.json --out klb_receipt.json
uv run oasg pressure history.jsonl --out pressure_vector.json
uv run oasg scheduler history.jsonl --out scheduler_state.json
uv run oasg compare --baseline baseline.jsonl --candidate candidate.jsonl --out-dir comparison
uv run oasg witness --coordinate KLB_2.pure_read --candidate-snapshot comparison/candidate_snapshot.json --candidate-klb comparison/candidate_klb_receipt.json --contract comparison/comparison_contract.json --workload comparison/workload_manifest.json --out comparison/positive_evidence_witnesses.json
uv run oasg gate --baseline baseline.jsonl --candidate candidate.jsonl --contract comparison/comparison_contract.json --workload comparison/workload_manifest.json --witnesses comparison/positive_evidence_witnesses.json
uv run oasg mutate plan --out-dir mutation --mutation-id mut_001 --coordinate KLB_2.pure_read --action-id pure_read
uv run oasg mutator profile init --out mutators.json
uv run oasg workload manifest --baseline baseline.jsonl --candidate candidate.jsonl --out-dir comparison
uv run oasg workload run --mutation mutation/mutation_record.json --candidate candidate.jsonl --workload comparison/workload_manifest.json --out-dir .oasg/workload --runner ledger-replay --trial-ledger-out observed_trial.jsonl
uv run oasg trial run --phase shadow --mutation mutation/mutation_record.json --candidate candidate.jsonl --workload comparison/workload_manifest.json --out-dir .oasg/trial --runner ledger-replay --trial-ledger observed_trial.jsonl
uv run oasg optimize plan --history history.jsonl --library workflow_library.json --out-dir .oasg/plan
uv run oasg optimize run --history history.jsonl --library workflow_library.json --out-dir .oasg/run --cycles 1
uv run oasg optimize watch --history history.jsonl --library workflow_library.json --state optimizer_state.json --out-dir .oasg/watch --max-iterations 1
uv run oasg optimize supervise --history history.jsonl --library workflow_library.json --state optimizer_state.json --out-dir .oasg/supervise --max-iterations 1
uv run oasg experiment verify-longrun --run-dir experiment/ollama_gemma4_e4b_longrun/runs/latest --out experiment/ollama_gemma4_e4b_longrun/results
uv run oasg experiment diagnose-promotion --run-dir experiment/ollama_gemma4_e4b_longrun/runs/latest --out experiment/ollama_gemma4_e4b_longrun/results
uv run oasg conformance run examples/conformance
```

Operational commands emit deterministic JSON receipts where possible.
Adapters are convenience wrappers. They are outside the trusted gate and cannot create positive promotion evidence by themselves.
Included examples:
- `oasg.adapters.invoke_command`: local subprocess observation wrapper.
- `oasg.adapters.invoke_function`: Python callable observation wrapper.
- `oasg.adapters.openai_compatible.invoke_openai_compatible`: optional OpenAI-compatible HTTP request wrapper.
The safe pattern is:
- call your model or tool;
- convert the result into a `ModelEvent`;
- seal it into an OASG event record;
- append the record to the observable ledger;
- let reducers, gates, and trial receipts decide whether workflow policy can change.
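The steps above can be sketched as a thin wrapper. The function name and event fields are hypothetical, and a real integration would seal the record through OASG's adapters rather than writing raw JSON.

```python
import json
import time

def observe_model_call(model_fn, prompt: str, ledger_path: str) -> str:
    """Call a model, record only what is observable, append it to a JSONL ledger.

    The event shape here is illustrative, not OASG's sealed record format.
    Note that no quality judgment is recorded, only observable counters.
    """
    started = time.monotonic()
    try:
        output = model_fn(prompt)
        status = "completed"
    except Exception as exc:
        output, status = "", f"failed:{type(exc).__name__}"
    event = {
        "type": "model_event",
        "status": status,
        "duration_s": round(time.monotonic() - started, 3),
        "output_chars": len(output),  # observable counter, not judged quality
    }
    with open(ledger_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(event, sort_keys=True) + "\n")
    return output
```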
Local Ollama experiments in this repository use only localhost Ollama as the model endpoint.
OASG is not a replacement for an agent framework. It can sit beside one:
- plain Python: wrap a function or model call and append an OASG event;
- LangGraph: LangGraph handles durable execution and resume, OASG handles promotion gates;
- CrewAI: CrewAI handles crew/task execution, OASG observes outcomes and gates policy changes;
- any provider: emit JSONL observations and keep provider output outside the trusted gate.
See `examples/framework_adapters` for dependency-free adapter patterns. LangGraph and CrewAI are optional examples, not package dependencies.
Common statuses:
- `rejected_no_concrete_positive_evidence`: an improved coordinate lacks a valid sidecar witness.
- `rejected_floor_violation`: a protected floor regressed.
- `rejected_contaminated_comparison`: baseline/candidate workload pairing is not equivalent.
- `rejected_effect_policy`: the mutation requests a disallowed effect or promotion class.
- `rejected_semantic_floor_missing`: a claim-emitting action lacks a semantic-floor policy.
- `rejected_secret_taint`: secret or unknown-secret taint reached a protected action.
- `inconclusive_klb_overflow`: bounded `KLB_2` enumeration exceeded the profile cap.
- `no_valid_candidate`: the optimizer found no candidate with complete gate, shadow, lease, and witness receipts.
- `no_new_work`: watch/supervise saw the same append index and ledger prefix as the prior checkpoint.
- `stale_optimizer_state`: saved optimizer state and current ledger prefix/append index disagree.
- `library_conflict`: the workflow library changed between load and atomic write.
Rejection is not a runtime error in OASG. It is often the correct fail-closed result.
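Since rejection is often the correct fail-closed outcome, callers should branch on receipt status rather than raising. A sketch, assuming receipts carry a top-level `status` field with the string values from the guide above:

```python
PROMOTED = {"safe_promotion", "active_promoted"}
HOLD = {"safe_non_regression", "no_new_work"}

def handle_receipt(receipt: dict) -> str:
    """Map a gate/optimizer receipt status to a caller-side action (sketch)."""
    status = receipt.get("status", "")
    if status in PROMOTED:
        return "apply"
    if status in HOLD:
        return "wait"
    if status.startswith("rejected_") or status.startswith("inconclusive_"):
        return "keep_current_policy"  # fail-closed: rejection is not an error
    return "investigate"              # stale state, forks, library conflicts
```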
The repository includes local Ollama gemma4:e4b experiments. They are designed to test workflow
operation, not model intelligence. All reported runs used deterministic operational validators and
kept failed, rejected, and inconclusive receipts.
Current evidence bottom line:
- OASG showed a practical workflow-operation improvement over a deliberately weak fixed baseline in the decisive experiment.
- OASG did not show an incremental improvement over a calibration-selected strong static baseline on a fixed held-out distribution in the strong-baseline v2 experiment.
- OASG did show time-boxed post-drift recovery over a calibration-selected strong static workflow in the nonstationary strong-baseline protocol.
- Therefore, the scientifically honest claim is conditional: this implementation has positive evidence for workflow adaptation when operational requirements drift, while fixed-distribution strong-baseline evidence remains negative.
| experiment | classification | key result | interpretation |
|---|---|---|---|
| `experiment/ollama_gemma4_e4b_pilot` | `no_clear_effect` | 12 tasks; baseline and adaptive both closed 8/12; active promotions 0 | Initial pilot did not establish adaptation. |
| `experiment/ollama_gemma4_e4b_pilot` effect profile | `no_clear_effect` | 48 held-out eval tasks; baseline and adaptive both closed 26/48; active promotions 0 | Workflow-sensitive design still did not activate promotion. |
| `experiment/ollama_gemma4_e4b_longrun` | `inconclusive_no_active_policy` | baseline 276/408 closed; observe-only 277/408 closed; adaptive evaluation was not run because active promotions were 0 | Long-run measurement correctly refused to claim an OASG effect. |
| `experiment/ollama_gemma4_e4b_definitive` | `workload_not_sensitive` | mechanism qualification blocked Stage B; no effect claim | The positive-control policy did not establish a useful measurement workload. |
| `experiment/ollama_gemma4_e4b_decisive` | `oasg_effect_confirmed` | 5 seeds, 680 paired held-out tasks; adaptive debt AUC 2040 -> 921; closure 0 -> 337; hard-floor regressions 0 | Under this preregistered weak-baseline workload, OASG adaptive produced a practical workflow-operation improvement. |
| `experiment/ollama_gemma4_e4b_strong_baseline` | `promotion_mechanism_failure_vs_strong_baseline` | strong baseline qualified; adaptive readiness active seeds 0/4 required; run interrupted after 7/25 held-out condition blocks | No incremental OASG effect over the strong baseline is claimed. The run was stopped because adaptive activation failed before evaluation, making the primary effect question non-identifiable. |
| `experiment/ollama_gemma4_e4b_strong_baseline_v2` | `no_incremental_effect_vs_strong_baseline` | 5 seeds, 680 paired held-out tasks; strong static debt AUC 434; OASG adaptive debt AUC 436; debt delta +2, CI [0, 5]; cost delta +7652, CI [1534, 14346]; hard-floor regressions 0 | Readiness succeeded, but held-out evaluation did not show incremental OASG value over the calibrated strong static workflow. |
| `experiment/ollama_gemma4_e4b_nonstationary_strong_baseline` | `oasg_nonstationary_effect_confirmed_timeboxed` | 2 seeds, 48 paired post-drift tasks; strong static debt AUC 112; OASG adaptive debt AUC 84; debt delta -28, CI [-51, -10]; closure 20/48 -> 27/48; hard-floor regressions 0 | Time-boxed positive evidence that fail-closed OASG adaptation recovered post-drift operational debt over a calibration-selected strong static workflow. The claim is limited to this frozen protocol and is not universal. |
| `experiment/ollama_gemma4_e4b_nonstationary_confirmatory` | protocol added; no confirmatory effect claim yet | four-variant follow-up protocol; a mock/small wiring run classifies `inconclusive_insufficient_power` | Audited to separate broad drift support, structural-only support, mixed-reversion/policy-retirement support, and cost-aware effects. A real all-variant Ollama run is required before any effect claim. |
The decisive run is the strongest weak-baseline positive evidence in this repository.
Artifacts:

- results report: `experiment/ollama_gemma4_e4b_decisive/results/report.md`
- metrics: `experiment/ollama_gemma4_e4b_decisive/results/metrics.json`
- verification: `experiment/ollama_gemma4_e4b_decisive/results/verification.json`
- promotion diagnostic: `experiment/ollama_gemma4_e4b_decisive/results/promotion_diagnostic.json`
Condition summary from the decisive run:
| condition | tasks | closed | debt AUC | parse failures | validation failures | unresolved obligations | active mutations |
|---|---|---|---|---|---|---|---|
| `baseline_fixed` | 680 | 0 | 2040 | 680 | 680 | 680 | 0 |
| `oasg_observe_only` | 680 | 0 | 2040 | 680 | 680 | 680 | 0 |
| `forced_policy_positive_control` | 680 | 463 | 434 | 0 | 217 | 217 | 0 |
| `oasg_adaptive` | 680 | 337 | 921 | 235 | 343 | 343 | 6 |
Paired effects:

- Adaptive vs baseline debt AUC delta: `-1119`.
- Adaptive vs baseline debt AUC reduction: `54.85%`.
- Bootstrap CI for the adaptive-baseline debt delta: `[-1179, -1050]`.
- Forced positive-control vs baseline debt AUC delta: `-1606`.
- Verification status: `ok`.
- Invalid ledgers: none reported.
- Active seeds: `5/5`.
- Active mutation ids:
  - `mut_family_safe_expr_prompt_safe_python_expression`
  - `mut_receipt_template_only_replay_rollback_receipt`
  - `mut_receipt_template_only_validator_receipt`
  - `mut_schema_keys_only_json_schema_repair`
  - `mut_schema_keys_only_obligation_closure`
  - `mut_strict_json_minimal_code_transform`
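The bootstrap confidence intervals reported above can be reproduced in spirit with a percentile bootstrap over paired per-task deltas. This is an illustrative method, not the repository's exact resampling procedure or seed handling.

```python
import random

def paired_bootstrap_ci(baseline, candidate, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean paired delta (candidate - baseline)."""
    rng = random.Random(seed)  # fixed seed keeps the receipt reproducible
    deltas = [c - b for b, c in zip(baseline, candidate)]
    means = []
    for _ in range(n_boot):
        # Resample paired deltas with replacement and record the mean.
        sample = [deltas[rng.randrange(len(deltas))] for _ in deltas]
        means.append(sum(sample) / len(sample))
    means.sort()
    lo = means[int((alpha / 2) * n_boot)]
    hi = means[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi
```

A negative interval that excludes zero, as in the decisive run's `[-1179, -1050]`, indicates a consistent paired debt reduction.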
Scientific interpretation:
- This is positive evidence that OASG can reduce observable operational debt in the tested `gemma4:e4b` workflow setting.
- It is not evidence that the model became smarter.
- It is not evidence of universal OASG effectiveness.
- The baseline was intentionally weak and brittle. The result proves improvement over that fixed workflow, not over a strong hand-tuned production workflow.
- The forced positive-control was better than OASG adaptive, so OASG did not find the full available policy improvement. It found a substantial subset.
- The observe-only condition matched baseline, which supports the interpretation that improvement came from active workflow-policy promotion, not from measurement alone.
Strong-baseline follow-up:
- A later strong-baseline protocol qualified a strong static workflow, but OASG did not produce any runner-ledger-backed active policy change from that strong starting point.
- That run was interrupted after readiness failure, with classification `promotion_mechanism_failure_vs_strong_baseline`.
- This is negative evidence for the current implementation's ability to add incremental value over that strong static workflow, not a general proof that OASG cannot help stronger baselines.
- Artifacts: `experiment/ollama_gemma4_e4b_strong_baseline/results/20260511T113612Z_interrupted/report.md` and `experiment/ollama_gemma4_e4b_strong_baseline/results/20260511T113612Z_interrupted/interruption_receipt.json`.
Strong-baseline v2 protocol:
- The v2 profile added an explicit `incremental_headroom` gate and then completed held-out evaluation after readiness passed.
- Stage 0: `strong_baseline_qualified`; the strong static policy reduced calibration debt AUC by `7861` bps versus the weak fixed baseline.
- Stage 1: `debt_headroom_exists`; calibration canaries found 43 incremental candidates.
- Stage 2: `adaptive_from_strong_ready`; active changes appeared in all 5 seeds.
- Stage 3: held-out evaluation did not show incremental gain over strong static:
  - `strong_static_calibrated`: debt AUC `434`, cost units `1580136`, closed `463/680`;
  - `oasg_adaptive_from_strong`: debt AUC `436`, cost units `1587788`, closed `463/680`;
  - primary debt delta `+2`, debt CI `[0, 5]`; primary cost delta `+7652`, cost CI `[1534, 14346]`.
- Final classification: `no_incremental_effect_vs_strong_baseline`.
- Interpretation: this is negative evidence for incremental value over this strong static workflow, not evidence that OASG cannot help all strong baselines.
- Curated artifacts: `experiment/ollama_gemma4_e4b_strong_baseline_v2/results/report.md`, `experiment/ollama_gemma4_e4b_strong_baseline_v2/results/metrics.json`, and `experiment/ollama_gemma4_e4b_strong_baseline_v2/results/verification.json`.
Nonstationary strong-baseline protocol:
- This profile tests the narrower OASG claim that adaptation should matter when a strong static workflow is calibrated on Phase A but later faces ordered workload drift.
- Final classification: `oasg_nonstationary_effect_confirmed_timeboxed`.
- Integrity: verification `ok`, paired post-drift task count `48`, hard-floor regressions `0`.
- Primary result:
  - `strong_static_calibrated`: debt AUC `112`, cost units `148517`, closed `20/48`;
  - `oasg_adaptive_from_strong`: debt AUC `84`, cost units `137059`, closed `27/48`;
  - debt delta `-28`, debt CI `[-51, -10]`; cost delta `-11458`, cost CI `[-31074, 7272]`;
  - adaptation lag: Phase B `1` epoch, Phase C `0`, Phase D `0`.
- Secondary controls:
  - OASG vs observe-only debt delta `-29`, CI `[-52, -11]`;
  - OASG vs rule-adaptive debt delta `-30`, CI `[-50, -12]`.
- Interpretation: this is positive time-boxed evidence for fail-closed post-drift workflow adaptation over a strong static workflow. It does not contradict the fixed-distribution strong-baseline v2 negative result; it narrows the claim to nonstationary operation.
- Limits: only 2 seeds, controlled synthetic operational drift, local `gemma4:e4b`, deterministic validators, and repository-defined thresholds. It is not universal evidence and does not imply model intelligence improvement.
- Curated artifacts: `experiment/ollama_gemma4_e4b_nonstationary_strong_baseline/results/report.md`, `experiment/ollama_gemma4_e4b_nonstationary_strong_baseline/results/metrics.json`, and `experiment/ollama_gemma4_e4b_nonstationary_strong_baseline/results/verification.json`.
Requires local Ollama with `gemma4:e4b` installed.

```shell
cd path\to\oasg
uv sync
ollama list
uv run python experiment\ollama_gemma4_e4b_decisive\scripts\run_decisive_experiment.py --config experiment\ollama_gemma4_e4b_decisive\config_decisive.json
uv run python experiment\ollama_gemma4_e4b_decisive\scripts\analyze_decisive_results.py --run-dir experiment\ollama_gemma4_e4b_decisive\runs\latest --out experiment\ollama_gemma4_e4b_decisive\results
```

The effect claim is limited to the frozen workload, model, prompts, validators, implementation, and decision thresholds in that experiment profile.
Requires local Ollama with gemma4:e4b installed. The default config is a short time-boxed
nonstationary protocol, not a universal benchmark.
```shell
cd path\to\oasg
uv sync
ollama list
uv run python experiment\ollama_gemma4_e4b_nonstationary_strong_baseline\scripts\run_nonstationary_experiment.py --config experiment\ollama_gemma4_e4b_nonstationary_strong_baseline\config_nonstationary.json
uv run python experiment\ollama_gemma4_e4b_nonstationary_strong_baseline\scripts\analyze_nonstationary_results.py --run-dir experiment\ollama_gemma4_e4b_nonstationary_strong_baseline\runs\latest --out experiment\ollama_gemma4_e4b_nonstationary_strong_baseline\results
```

Before publishing a change or port:
```shell
uv run pytest
uv run ruff check
uv run mypy src
uv run oasg conformance run examples/conformance
```

At the time this README was updated after auditing the confirmatory nonstationary protocol, these checks passed in the current workspace: 111 passed, ruff clean, mypy clean, and conformance status `ok`.
The current public-readiness review is recorded in `docs/publication_audit.md`.
If you use OASG, cite the archived software release:
- DOI: 10.5281/zenodo.20107660
- Repository: github.com/kadubon/oasg
- Citation metadata: `CITATION.cff`

```yaml
cff-version: 1.2.0
title: "OASG: Observable-only Autonomic Slack Gradient for Local-first AI Agent Workflow Optimization"
version: 1.1.0
doi: 10.5281/zenodo.20107660
repository-code: "https://github.com/kadubon/oasg"
```

Keywords: AI agents, agent workflow optimization, long-running agents, local-first AI, model-agnostic agent framework, no LLM judge, observable ledgers, deterministic reducers, workflow policy optimization, autonomic agents, JSONL ledger, canonical hashing, Ollama experiments, Python uv.
- `theory.md`: v1.0 theory and specification
- `docs/quick_mental_model.md`: five-minute engineering mental model
- `src/oasg/canonical.py`: canonical JSON and hash domains
- `src/oasg/ledger.py`: JSONL sealing and prefix verification
- `src/oasg/reducers/`: deterministic reducers
- `src/oasg/pressure.py`: typed pressure vector calculation
- `src/oasg/scheduler.py`: pressure scheduling and fairness state
- `src/oasg/mutators.py`: workflow-policy mutation proposals
- `src/oasg/optimizer.py`: run/watch/supervise optimizer loops
- `src/oasg/optimizer_state.py`: durable optimizer checkpoints
- `src/oasg/library.py`: workflow library state, rollback, quarantine
- `src/oasg/policy_state.py`: structured workflow policy and mutation patches
- `src/oasg/harness.py`: local harness scaffold
- `src/oasg/policy_effects.py`: demo-only policy-patch smoke semantics
- `src/oasg/runners.py`: ledger-replay/demo-replay/local-command runners
- `src/oasg/klb.py`: bounded KLB_2 enumeration
- `src/oasg/gate.py`: dominance gate and witness validation
- `src/oasg/schemas/`: JSON Schema export
- `src/oasg/adapters/`: model/tool connector contracts
- `examples/`: quickstart and conformance fixtures
- `examples/minimal_agent_integration/`: shortest agent-to-ledger-to-gate example
- `examples/framework_adapters/`: optional plain Python, LangGraph, and CrewAI patterns
- `experiment/`: Ollama experiment protocols and results
- `tests/`: unit, integration, and experiment-script tests
Apache-2.0.