Refresh README hero and add artifact-backed benchmark snapshot #79
ProfRandom92 wants to merge 5 commits into
✅ Deploy Preview for comptext-v7 canceled.
Code Review
This pull request significantly refactors the README.md to focus on deterministic operational memory and CI-audited replay checks, removing extensive research-oriented descriptions in favor of a concise benchmark snapshot and integrity model. A review comment identifies a potential inconsistency regarding the project's limitations, noting that the document lists iterative replay degradation as a future target despite existing reports covering up to 250 iterations.
- Current benchmarks use curated fixtures, not broad production traffic.
- This is not solved AI memory and does not claim general long-term recall.
- This is not an autonomous agent framework.
- Iterative replay degradation is the next validation target.
The limitation "Iterative replay degradation is the next validation target" appears to be inconsistent with the "Primary replay-continuity benchmark" described later in the document (line 73) and the results already present in reports/replay_continuity/validation_report.md, which explicitly covers up to 250 iterations. If this limitation refers to a specific subset of benchmarks (e.g., the Paper or Agent Trace benchmarks) or a higher scale of iteration, please clarify it to avoid confusing reviewers.
Motivation
Description
- README hero: `# Comptextv7` and the one-line descriptor; removed decorative and dead image markup. The only modified file is `README.md`.
- `shields.io` badges for `Python` and `CI`, and static badges for `Deterministic Replay`, `No LLM Judging`, `Replay Artifacts`, and `Operational State`.
- Benchmark snapshot from `artifacts/paper_replay_results.json` and `artifacts/agent_trace_replay_results.json` (paper: 3 papers, `avg_compression_ratio` 1.347063, `replay_consistency` 0.791667; agent: 3 traces, `avg_compression_ratio` 1.773954, `replay_consistency` 1.000000, `operational_drift` 0.000000).
- `What exists now` capability table, an `Integrity model` section describing determinism and auditability, and a short `Limitations` section that remains factual and non-hype.
Testing
- Ran `python -m pytest tests/test_replay_continuity.py`, which completed successfully (8 passed).
- Checked that the values in `README.md` match the committed JSON artifacts and confirmed decorative image markup was removed; that check passed.
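The README-versus-artifact check described under Testing could be sketched roughly as follows. This is a minimal illustration, not the script used for this PR: the helper names (`format_metric`, `missing_metrics`, `check_readme`) are hypothetical, and it assumes each artifact JSON exposes its metrics as top-level numeric keys and that the README prints them with six decimal places, neither of which is confirmed here.

```python
import json
from pathlib import Path


def format_metric(value: float) -> str:
    """Render a metric with six decimals (assumed to be the README's format)."""
    return f"{value:.6f}"


def missing_metrics(readme_text: str, metrics: dict[str, float]) -> list[str]:
    """Return the names of metrics whose formatted value is absent from the README text."""
    return [
        name
        for name, value in metrics.items()
        if format_metric(value) not in readme_text
    ]


def check_readme(readme_path: Path, artifact_path: Path, keys: list[str]) -> list[str]:
    """Load one artifact and report which of the requested keys the README does not show.

    Assumption: the artifact stores each key at the top level of the JSON object.
    """
    readme_text = readme_path.read_text(encoding="utf-8")
    artifact = json.loads(artifact_path.read_text(encoding="utf-8"))
    metrics = {key: float(artifact[key]) for key in keys}
    return missing_metrics(readme_text, metrics)
```

Under those assumptions, calling `check_readme(Path("README.md"), Path("artifacts/paper_replay_results.json"), ["avg_compression_ratio", "replay_consistency"])` would return an empty list when the snapshot is in sync. Plain substring matching is deliberately simple; it cannot tell which metric a number belongs to, so a stricter check would match the label and value together.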