feat(research): gap validation protocol — automated sweep + decision matrix#3734

Open
ryanklee wants to merge 2 commits into main from beta/research-gap-validation-protocol

Conversation

@ryanklee
Collaborator

@ryanklee ryanklee commented May 26, 2026

Summary

  • Adds scripts/gap-validate.py — automated sweep that scores research gaps against a decay/evidence/actionability rubric and produces a ranked decision matrix
  • Adds docs/research/gap-portfolio-registry.yaml — structured registry of research gaps with schema validation
  • Adds docs/research/gap-portfolio-SCHEMA.md — schema documentation for the registry format
  • Adds docs/research/gap-validation-observation-guide.md — observation protocol for validating gap claims
  • Adds scripts/gap-decay-report.py — decay monitoring for gap freshness
  • Adds tests/scripts/test_gap_validate.py — unit tests for the validation logic

Test plan

  • uv run pytest tests/scripts/test_gap_validate.py -q passes
  • uv run ruff check scripts/gap-validate.py scripts/gap-decay-report.py clean
  • uv run python scripts/gap-validate.py --help shows usage

🤖 Generated with Claude Code

ryanklee and others added 2 commits May 20, 2026 21:52
…matrix

Restore Phase 1 gap registry files (accidentally deleted in #3580) and build
Phase 2 validation tooling: CLI `gap-validate.py` with 4-source automated
sweep (OpenAlex patents, GitHub code search, Semantic Scholar, Papers with
Code), 6-signal decision matrix (4-of-6 = high confidence), Phase 2 community
probe scaffolding (forum + cold-email templates), and Phase 3 practitioner
observation guide (7-question contextual inquiry protocol). Batch-validated
GAP-001, GAP-003, GAP-007 — all high_confidence_novel. 16 tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai Bot commented May 26, 2026

Walkthrough

This PR establishes a research gap validation platform with a curated registry of 18 research gaps, automated prior-art detection across four sources, scaffolding generators for outreach campaigns, practitioner observation protocols, and supporting utilities including a gap decay report tool and comprehensive test coverage.

Changes

Gap Validation and Registry System

| Layer / File(s) | Summary |
| --- | --- |
| **Gap Registry Schema and Data**<br>docs/research/gap-portfolio-SCHEMA.md, docs/research/gap-portfolio-registry.yaml, docs/research/gap-validation-observation-guide.md | Gap portfolio record structure specifies fields for identifiers, scoring metrics, decay rates, and review metadata. Registry metadata includes schema versioning and WIP limit. Populates 18 gaps (GAP-001 through GAP-018) with titles, justifications, disposition, validation status, and decay parameters. Observation guide documents Phase 3 practitioner interview protocol with participant selection, question flow, scoring rules, and ethics requirements. |
| **Phase 1 Automated Validation Sweeps**<br>scripts/gap-validate.py (entry, types, registry loading, search expansion, GitHub/Semantic Scholar/patents/papers-with-code sweep functions, decision matrix) | Implements four parallel sweeps querying GitHub code search, Semantic Scholar API, OpenAlex patents, and Papers with Code HTTP API. Each sweep returns vote (novel/prior_art_exists/inconclusive), confidence score, and evidence URLs. Builds search terms from gap title and apparatus justification with domain-specific keyword expansion. Orchestrates all sweeps together, applies decision matrix logic over vote distribution, and logs vote/confidence metrics. |
| **Phase 2 & 3 Output Generation**<br>scripts/gap-validate.py (results persistence, registry update, forum/email/scaffolding generation, observation guide content) | Writes Phase 1 sweep results to timestamped JSON in local vault. Updates gap registry validation_status and last_reviewed timestamp. Generates forum post and cold email templates embedding gap ID, title, justification, and placeholder sections for sweep findings. Includes static Phase 3 practitioner observation guide markdown. |
| **CLI Orchestration**<br>scripts/gap-validate.py (main function and subcommand wiring) | Exposes subcommands: sweep (run Phase 1 for single gap), scaffold (generate Phase 2 templates), observation-guide (output Phase 3 protocol), and batch (iterate multiple gap IDs with inter-request delays). Includes -v/--verbose log level control, optional registry mutation flag, and appropriate exit codes. |
| **Gap Validation Tests**<br>tests/scripts/test_gap_validate.py | Dynamic import of gap-validate module, sample gap and registry fixtures. Unit tests verify search term building includes title and domain keywords; decision matrix logic across vote/confidence distributions; gap lookup by ID; SignalResult default fields; forum/email templates include gap ID and metadata; observation guide contains 7+ questions and required sections; registry loading returns expected gap counts and validation_status distribution. |
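
The decision-matrix step summarized above could look roughly like the sketch below. It is an illustration, not the PR's implementation: the vote literals come from the walkthrough, the 4-of-6 threshold and the `high_confidence_novel` label come from the commit message, and everything else (`decide`, `likely_prior_art`, `needs_review`) is a hypothetical name chosen for this example.

```python
from collections import Counter

# "4-of-6 = high confidence" per the commit message.
HIGH_CONFIDENCE_THRESHOLD = 4


def decide(votes: list[str]) -> str:
    """Collapse per-signal votes into one decision label.

    Only "high_confidence_novel" is a label named in the PR; the other
    two return values are placeholders for this sketch.
    """
    counts = Counter(votes)
    if counts["novel"] >= HIGH_CONFIDENCE_THRESHOLD:
        return "high_confidence_novel"
    if counts["prior_art_exists"] >= HIGH_CONFIDENCE_THRESHOLD:
        return "likely_prior_art"  # hypothetical label
    return "needs_review"  # hypothetical label


print(decide(["novel"] * 4 + ["inconclusive"] * 2))
print(decide(["novel"] * 3 + ["inconclusive"] * 3))
```

Counting only explicit `novel` votes means an `inconclusive` signal can never push a gap over the threshold, which is the property the review comments on error handling depend on.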

Gap Decay Report Utility

| Layer / File(s) | Summary |
| --- | --- |
| **Decay Report Tool**<br>scripts/gap-decay-report.py | Loads registry YAML, computes review-based expiry date using last_reviewed plus decay_rate_halflife_days. Filters gaps expiring within configurable horizon (default 90 days), sorts by remaining days. Outputs as indented JSON (with --json flag) or human-readable table listing gap ID, remaining days, halflife, disposition, and uniqueness score, with EXPIRED/soon status indicators. |
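
As a rough illustration of the expiry computation described above (expiry = last_reviewed + decay_rate_halflife_days, filtered to a horizon and sorted by remaining days), here is a small sketch. The function name `expiring_gaps` and the three sample records are made up for the example; only the field names, the 90-day default, and the EXPIRED/soon labels come from the walkthrough.

```python
from datetime import date, timedelta


def expiring_gaps(gaps: list[dict], today: date, horizon_days: int = 90) -> list[dict]:
    """Return gaps whose review expires within the horizon, soonest first."""
    horizon = today + timedelta(days=horizon_days)
    rows = []
    for gap in gaps:
        reviewed = date.fromisoformat(gap["last_reviewed"])
        expiry = reviewed + timedelta(days=gap["decay_rate_halflife_days"])
        if expiry <= horizon:
            rows.append({"gap_id": gap["gap_id"], "remaining_days": (expiry - today).days})
    return sorted(rows, key=lambda row: row["remaining_days"])


# Made-up sample records, not taken from the registry.
sample = [
    {"gap_id": "GAP-A", "last_reviewed": "2026-01-01", "decay_rate_halflife_days": 180},
    {"gap_id": "GAP-B", "last_reviewed": "2026-05-01", "decay_rate_halflife_days": 365},
    {"gap_id": "GAP-C", "last_reviewed": "2025-11-01", "decay_rate_halflife_days": 90},
]
for row in expiring_gaps(sample, today=date(2026, 6, 1)):
    status = "EXPIRED" if row["remaining_days"] < 0 else "soon"
    print(row["gap_id"], row["remaining_days"], status)
```

Passing `today` in explicitly, rather than calling `date.today()` inside the function, keeps the calculation deterministic and easy to unit test.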

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 Whiskers twitch with glee—a warren of research gaps now lies mapped,
Four sweeps scurry cross the web seeking prior-art traps,
Forum and email templates ready for the hunt,
While Phase Three observers await their turn upfront.
Decay halflife counts the days—a thoughtful system truly wrapped!

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Description check | ⚠️ Warning | The PR description includes a summary of all major changes and a test plan with specific commands, but completely omits the required AuthorityCase section. | Add the AuthorityCase section with Case and Slice identifiers (or 'pre-methodology' if applicable), and complete CLAUDE.md hygiene checklist items. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately summarizes the main change: introducing an automated gap validation protocol with a decision matrix. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a0c13c2837

Comment thread scripts/gap-validate.py
    text=True,
    timeout=SWEEP_TIMEOUT,
)
if proc.returncode == 0 and proc.stdout.strip():

P1: Fail code-search signal when gh query errors

This branch only handles proc.returncode == 0; any non-zero gh api exit (auth missing, rate-limit, HTTP error) is silently treated as “no results,” which then falls through to a novel vote when unique_repos is empty. That can produce false novelty decisions from infrastructure/auth failures rather than real prior-art absence, so non-zero exits should return an inconclusive/error signal instead of being scored.
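
One way to separate "the query failed" from "the query found nothing" is sketched below. It assumes the sweep already holds a `subprocess.CompletedProcess` from the `gh` call; the function name `code_search_signal` and the dict shape are hypothetical, not taken from the PR.

```python
import subprocess


def code_search_signal(proc: subprocess.CompletedProcess) -> dict:
    """Map a finished `gh` invocation to a vote without conflating errors with absence."""
    if proc.returncode != 0:
        # Auth, rate-limit, and HTTP failures land here: report them, do not score them.
        return {"vote": "inconclusive", "error": (proc.stderr or "").strip()}
    lines = [line for line in proc.stdout.splitlines() if line.strip()]
    vote = "prior_art_exists" if lines else "novel"
    return {"vote": vote, "error": None}


# Simulated results so the sketch runs without the gh CLI installed.
failed = subprocess.CompletedProcess(args=["gh"], returncode=1, stdout="", stderr="HTTP 403: rate limit")
empty = subprocess.CompletedProcess(args=["gh"], returncode=0, stdout="", stderr="")
print(code_search_signal(failed)["vote"], code_search_signal(empty)["vote"])
```

Only a zero exit code with empty output is allowed to count as `novel`; every other failure mode carries its stderr forward for the reviewer.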

Comment thread scripts/gap-validate.py
Comment on lines +169 to +170
if resp.status_code == 200:
    data = resp.json()

P1: Treat non-200 API responses as inconclusive

The sweeper only processes status 200 and otherwise continues without recording an error, so 429/403/5xx responses are interpreted like empty search results and can be scored as novel. This can systematically inflate novelty confidence during transient API failures or throttling; non-200 responses should be surfaced as inconclusive/error signals instead of silently ignored.
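
The status handling suggested here can be expressed as a small pure function, sketched below under the assumption that the sweep knows the response status and the number of results it parsed. `vote_from_response` is a hypothetical helper, kept free of any HTTP client so it can be tested in isolation.

```python
from typing import Optional


def vote_from_response(status_code: int, result_count: int) -> tuple[str, Optional[str]]:
    """Return (vote, error). Any non-200 status is an error, never evidence of novelty."""
    if status_code != 200:
        return "inconclusive", f"HTTP {status_code}"
    if result_count > 0:
        return "prior_art_exists", None
    return "novel", None


for status, count in [(200, 0), (200, 3), (429, 0), (503, 0)]:
    print(status, count, vote_from_response(status, count))
```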

- gap_id: GAP-002
title: Epistemic quality infrastructure
request_ref: REQ-20260512-epistemic-quality-infrastructure
disposition: execute

P2: Enforce single active execute gap in registry

The registry declares wip_limit: 1 and the schema states exactly one gap may be disposition: execute, but this entry introduces a second execute gap alongside GAP-001. That breaks the documented invariant and makes downstream tooling/reporting ambiguous about which gap is the single active execution target.
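
A check for this invariant could be as small as the sketch below. `check_wip_limit` is a hypothetical helper; it assumes the registry has been loaded into a dict with the `wip_limit` and `gaps` keys shown in this PR, and the third sample record is a placeholder rather than a real registry entry.

```python
def check_wip_limit(registry: dict) -> list[str]:
    """Return the IDs of `execute` gaps if they exceed `wip_limit`, else an empty list."""
    limit = registry.get("wip_limit", 1)  # defaulting to 1 is an assumption of this sketch
    active = [g["gap_id"] for g in registry.get("gaps", []) if g.get("disposition") == "execute"]
    return active if len(active) > limit else []


registry = {
    "wip_limit": 1,
    "gaps": [
        {"gap_id": "GAP-001", "disposition": "execute"},
        {"gap_id": "GAP-002", "disposition": "execute"},
        {"gap_id": "GAP-EXAMPLE", "disposition": "defer"},  # placeholder record
    ],
}
print(check_wip_limit(registry))
```

Running a check like this in the test suite or a pre-commit hook would catch the violation before it reaches downstream tooling.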

Comment thread scripts/gap-validate.py
Comment on lines +239 to +240
except httpx.HTTPError:
    pass

P1: Stop swallowing patent sweep transport errors

This except block drops httpx transport failures and keeps scoring as if the query simply found no prior art. If one or more patent lookups fails due to timeout/DNS/network issues, the function can still fall through to a novel vote, which misclassifies infrastructure failure as evidence of novelty.
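
A sketch of recording transport failures instead of dropping them follows. To stay runnable without httpx installed, it takes the lookup as an injected callable and catches `OSError` as a stand-in for `httpx.HTTPError`; `patent_sweep`, `fetch`, and `flaky_fetch` are hypothetical names.

```python
from typing import Callable


def patent_sweep(queries: list[str], fetch: Callable[[str], int]) -> dict:
    """Run each query via `fetch` (returns a hit count) and track failures explicitly."""
    hits, errors = 0, []
    for query in queries:
        try:
            hits += fetch(query)
        except OSError as exc:  # stand-in for the transport error class in use
            errors.append(f"{query}: {exc}")
    if hits:
        vote = "prior_art_exists"
    elif errors:
        vote = "inconclusive"  # zero hits is not trustworthy if any lookup failed
    else:
        vote = "novel"
    return {"vote": vote, "errors": errors}


def flaky_fetch(query: str) -> int:
    """Simulate a lookup that always times out."""
    raise TimeoutError("simulated timeout")


print(patent_sweep(["q1", "q2"], flaky_fetch)["vote"])
```

In this sketch a positive hit still wins even when another query failed, since found prior art remains valid evidence; only the "no hits" case is downgraded.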

Comment thread scripts/gap-validate.py
item = json.loads(line)
all_results.append(item)
source_urls.append(
    f"https://github.com/{item['repo']}/blob/main/{item['path']}"

P2: Build GitHub evidence URLs from actual branch

Evidence links are hardcoded to /blob/main/, but many repositories use a different default branch (for example master or trunk). In those cases the stored URLs 404, so reviewers cannot verify the supposed prior-art evidence from sweep output.
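
A possible shape for the URL builder is sketched below. `evidence_url` is a hypothetical helper that takes the ref as a parameter, so the caller can pass a commit SHA or default branch name when the search result provides one. The `HEAD` fallback rests on the assumption that github.com resolves `/blob/HEAD/...` to the repository's default branch, which should be verified before relying on it.

```python
from urllib.parse import quote


def evidence_url(repo: str, path: str, ref: str = "HEAD") -> str:
    """Build a blob URL pinned to `ref` instead of assuming a branch named main."""
    # quote() keeps "/" intact, so nested paths and slash-containing refs survive.
    return f"https://github.com/{repo}/blob/{quote(ref)}/{quote(path)}"


print(evidence_url("octo/repo", "src/a b.py", "abc123"))
print(evidence_url("octo/repo", "README.md"))
```

Pinning to a commit SHA, where available, also keeps the evidence link stable if the file later moves or changes.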

Comment thread scripts/gap-validate.py
Comment on lines +222 to +223
"filter": "type:patent",
"per_page": 10,

P1: Query OpenAlex with supported work types only

This request filters on type:patent, but OpenAlex’s documented work types do not include patent (they include values like article, preprint, report, standard, etc.). As a result, the patent sweep query is invalid and cannot return intended results, biasing the patents signal toward false novelty/inconclusive outcomes.


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 6

🧹 Nitpick comments (2)
scripts/gap-validate.py (2)

517-608: ⚡ Quick win

Avoid duplicating observation-guide content in code and docs.

The full guide is embedded here and also maintained in docs/research/gap-validation-observation-guide.md, creating drift risk. Load from the canonical doc file or keep one source-of-truth template.

Also applies to: 611-613
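
A minimal sketch of the single-source-of-truth approach is shown below. `read_observation_guide` is the helper name floated in the agent prompt, not existing code, and returning an empty string on a missing file is one possible fallback among several.

```python
import logging
import tempfile
from pathlib import Path

log = logging.getLogger(__name__)


def read_observation_guide(path: Path) -> str:
    """Read the canonical guide; return "" and log an error if it is missing."""
    try:
        return path.read_text(encoding="utf-8")
    except FileNotFoundError:
        log.error("Observation guide not found at %s", path)
        return ""


# Demonstration against a temporary file rather than the real doc.
with tempfile.TemporaryDirectory() as tmp:
    guide = Path(tmp) / "guide.md"
    guide.write_text("# Guide\n", encoding="utf-8")
    print(repr(read_observation_guide(guide)))
    print(repr(read_observation_guide(Path(tmp) / "missing.md")))
```

In the script, the module-level constant could then be assigned once from the path to docs/research/gap-validation-observation-guide.md, so tests that reference `OBSERVATION_GUIDE_CONTENT` keep working.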

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/gap-validate.py` around lines 517 - 608, The
OBSERVATION_GUIDE_CONTENT constant duplicates the canonical doc; replace the
hardcoded multi-line string by loading the contents of the canonical markdown
(docs/research/gap-validation-observation-guide.md) at runtime (or during script
initialization) and assign that text to OBSERVATION_GUIDE_CONTENT (with a clear
fallback that logs an error if the doc is missing). Locate
OBSERVATION_GUIDE_CONTENT in scripts/gap-validate.py, implement a small helper
(e.g., read_observation_guide or similar) to read the file once, and ensure any
tests or downstream consumers still reference the same symbol.

32-33: ⚡ Quick win

Make output directory configurable instead of hardcoding a personal path.

VAULT_OUTPUT_DIR points to a user-specific Documents path, which is brittle for CI and other operators. Prefer a CLI flag/env var with a sensible repo-local default.

Also applies to: 399-400, 505-507
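
One way to resolve the output directory is sketched below, with precedence CLI value, then environment variable, then a repo-local default. `resolve_output_dir` is a hypothetical helper; `GAP_VAULT_OUTPUT_DIR` and the `vault_output` default are the example names from the agent prompt, not settings the script currently reads.

```python
import os
from pathlib import Path
from typing import Optional


def resolve_output_dir(cli_value: Optional[str] = None, default: Path = Path("vault_output")) -> Path:
    """Pick the output directory: CLI flag, then GAP_VAULT_OUTPUT_DIR, then the default."""
    chosen = cli_value or os.environ.get("GAP_VAULT_OUTPUT_DIR") or default
    return Path(chosen)


out_dir = resolve_output_dir()
print(out_dir)
# At write time, callers would create it on demand:
#     out_dir.mkdir(parents=True, exist_ok=True)
```

Resolving the path at call time, instead of at import time, lets tests override the environment variable without reloading the module.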

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/gap-validate.py` around lines 32 - 33, Replace the hardcoded personal
path assigned to VAULT_OUTPUT_DIR with a configurable option: read from an
environment variable (e.g., GAP_VAULT_OUTPUT_DIR) or a CLI flag, falling back to
a sensible repo-local default such as REPO_ROOT / "output" or REPO_ROOT /
"vault_output"; update all other occurrences that use the same personal path
(the other VAULT_OUTPUT_DIR assignments/usages around the later blocks) so they
reference the new configurable variable or flag parsing logic instead of the
hardcoded Path.home() location, and ensure any code that writes output creates
the directory if missing.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 959703be-f7c1-42cd-a6c5-a34045c53ffc

📥 Commits

Reviewing files that changed from the base of the PR and between 54f981f and a0c13c2.

📒 Files selected for processing (6)
  • docs/research/gap-portfolio-SCHEMA.md
  • docs/research/gap-portfolio-registry.yaml
  • docs/research/gap-validation-observation-guide.md
  • scripts/gap-decay-report.py
  • scripts/gap-validate.py
  • tests/scripts/test_gap_validate.py

schema_version: 1
registry_id: research-gap-portfolio-v1
authority_case: CASE-20260509-RESEARCH-PO
wip_limit: 1

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

WIP invariant is currently violated by two active execute gaps.

wip_limit: 1 conflicts with two records marked disposition: execute (GAP-001 and GAP-002). This breaks the registry contract and can invalidate downstream prioritization logic.

Also applies to: 18-18, 30-30

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/research/gap-portfolio-registry.yaml` at line 4, The WIP invariant fails
because wip_limit: 1 conflicts with two records marked disposition: execute
(GAP-001 and GAP-002 — also applies to the entries at 18-18 and 30-30); fix by
either increasing the wip_limit value to at least 2 (update the wip_limit key)
or changing one or more of those records' disposition from execute to a
non-active state (e.g., backlog/defer) so the number of execute dispositions
does not exceed wip_limit; update the wip_limit or the disposition fields for
GAP-001/GAP-002 (and the entries at 18-18 and 30-30) accordingly.

Comment on lines +17 to +18
def load_registry(path: Path = REGISTRY) -> dict:
    return yaml.safe_load(path.read_text(encoding="utf-8"))

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add error handling for file I/O and YAML parsing.

If the registry file doesn't exist or contains invalid YAML, the script will crash with an unhelpful stack trace. Wrapping in a try-except block would provide clearer error messages.

🛡️ Proposed fix to add error handling
 def load_registry(path: Path = REGISTRY) -> dict:
+    try:
-    return yaml.safe_load(path.read_text(encoding="utf-8"))
+        return yaml.safe_load(path.read_text(encoding="utf-8"))
+    except FileNotFoundError:
+        print(f"Error: Registry file not found at {path}", file=sys.stderr)
+        sys.exit(1)
+    except yaml.YAMLError as e:
+        print(f"Error: Invalid YAML in registry file: {e}", file=sys.stderr)
+        sys.exit(1)
+    except Exception as e:
+        print(f"Error reading registry: {e}", file=sys.stderr)
+        sys.exit(1)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/gap-decay-report.py` around lines 17 - 18, The load_registry function
currently calls path.read_text and yaml.safe_load without error handling; wrap
the body of load_registry (the function that takes path: Path = REGISTRY) in a
try/except that catches file I/O errors (FileNotFoundError, PermissionError,
UnicodeDecodeError) and yaml.YAMLError, log or raise a clear, user-friendly
message that includes the path and the underlying exception, and either return a
sensible default (e.g., empty dict) or re-raise a custom exception to preserve
upstream behavior; ensure you reference REGISTRY and Path in the error text so
it's easy to locate the problematic file.

for gap in registry.get("gaps", []):
    halflife = gap.get("decay_rate_halflife_days", 365)
    reviewed = gap.get("last_reviewed", "2026-01-01")
    reviewed_dt = datetime.fromisoformat(reviewed)

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add error handling for date parsing.

If last_reviewed contains an invalid ISO date format, datetime.fromisoformat() will raise a ValueError and crash the script. Adding validation would provide clearer error messages and prevent script failure.

🛡️ Proposed fix to add date validation
-        reviewed_dt = datetime.fromisoformat(reviewed)
+        try:
+            reviewed_dt = datetime.fromisoformat(reviewed)
+        except ValueError:
+            print(f"Warning: Invalid date format for {gap.get('gap_id', 'unknown')}, skipping", file=sys.stderr)
+            continue
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/gap-decay-report.py` at line 28, The code calls
datetime.fromisoformat(reviewed) to produce reviewed_dt without validation; wrap
that call in a try/except that catches ValueError (and optionally TypeError) to
handle invalid or empty last_reviewed values, log a clear error mentioning the
offending last_reviewed string and the record identifier, and either skip that
record or exit with a non-zero status depending on desired behavior; update the
code around reviewed_dt = datetime.fromisoformat(reviewed) to perform this
validation and error handling (or use a safe parser like dateutil.parser.parse
inside the same try/except) so the script no longer crashes on malformed ISO
dates.

Comment on lines +34 to +36
"gap_id": gap["gap_id"],
"title": gap["title"],
"disposition": gap["disposition"],

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add validation for required gap fields.

Direct dictionary access without validation will raise KeyError if required fields (gap_id, title, disposition) are missing from a gap record. Consider validating these fields or using .get() with error handling.

🛡️ Proposed fix to validate required fields
+        # Validate required fields
+        required = ["gap_id", "title", "disposition"]
+        missing = [f for f in required if f not in gap]
+        if missing:
+            print(f"Warning: Gap missing required fields {missing}, skipping", file=sys.stderr)
+            continue
+        
         if expiry <= horizon:
             expiring.append(
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/gap-decay-report.py` around lines 34 - 36, The code directly accesses
gap["gap_id"], gap["title"], and gap["disposition"] which will raise KeyError
for malformed records; update the gap-processing logic (the block that builds
the dict with "gap_id"/"title"/"disposition") to validate presence of these keys
before using them—either use gap.get("gap_id")/get("title")/get("disposition")
and detect missing values, or explicitly check "gap_id" in gap etc., then log or
raise a clear error and skip the record (or provide a default) so the script
won't crash on missing fields.

Comment thread scripts/gap-validate.py
    text=True,
    timeout=SWEEP_TIMEOUT,
)
if proc.returncode == 0 and proc.stdout.strip():

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

External failures can be mis-scored as novel instead of inconclusive.

Several sweeps only handle success paths and otherwise continue with empty results, which can inflate novelty votes on auth/rate-limit/API failures. Treat non-200/nonzero responses as explicit inconclusive signals with captured error context.

#!/bin/bash
# Verify places where unsuccessful responses may be silently ignored.
rg -n "returncode == 0|status_code == 200|except httpx.HTTPError|pass" scripts/gap-validate.py

Also applies to: 169-170, 227-228, 239-240, 305-306

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/gap-validate.py` at line 117, The current check using "if
proc.returncode == 0 and proc.stdout.strip()" silently treats failures or empty
outputs as success/novel; instead, treat any nonzero return code or empty stdout
as an explicit inconclusive result and attach the captured stderr/stdout as
error context. Update every similar branch (where you check proc.returncode == 0
or status_code == 200) to: evaluate proc.returncode, proc.stdout, and
proc.stderr, set the result status to "inconclusive" when returncode != 0 or
stdout is empty, and include proc.stderr (and proc.stdout) in the
returned/recorded error message so external failures (auth/rate-limit/API
errors) are not mis-scored as novel. Ensure you apply this change for the
occurrences around the symbols proc.returncode, proc.stdout, proc.stderr and the
analogous HTTP checks (status_code) noted in the comment.

Comment on lines +140 to +143
def test_guide_content_has_7_questions(self) -> None:
    content = gap_validate.OBSERVATION_GUIDE_CONTENT
    question_count = content.count("**")
    assert question_count >= 7

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Question-count assertion is not actually counting questions.

On Line 142, content.count("**") counts bold markers, not question items, so this test can pass even when fewer than 7 questions exist.

Suggested fix
+import re
@@
 class TestObservationGuide:
     def test_guide_content_has_7_questions(self) -> None:
         content = gap_validate.OBSERVATION_GUIDE_CONTENT
-        question_count = content.count("**")
+        question_count = len(
+            re.findall(r"(?m)^\s*(?:\d+\.|[-*])\s+.*\?\s*$", content)
+        )
         assert question_count >= 7
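
To make the difference concrete, the sketch below runs both counting strategies over a made-up guide fragment (the sample text is invented for illustration and is not the real guide). `count_questions` is a hypothetical wrapper around the regex from the suggested fix. Note that the pattern, as written, requires `?` to be the last non-whitespace character on the line, so a question wrapped in bold (`**...?**`) would not be counted.

```python
import re

QUESTION_LINE = re.compile(r"(?m)^\s*(?:\d+\.|[-*])\s+.*\?\s*$")


def count_questions(content: str) -> int:
    """Count list items whose last non-whitespace character is a question mark."""
    return len(QUESTION_LINE.findall(content))


# Invented sample: four bold labels and only two real questions.
guide = "\n".join(
    [
        "**Setup**",
        "**Consent**",
        "**Scoring**",
        "**Ethics**",
        "1. How do you track open research questions today?",
        "2. What happens when a gap turns out to be already solved?",
    ]
)
print("bold markers:", guide.count("**"))
print("questions:", count_questions(guide))
```

Here the bold-marker count clears the `>= 7` threshold while the regex finds only two questions, which is the false pass described in the comment.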
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/scripts/test_gap_validate.py` around lines 140 - 143, The test
test_guide_content_has_7_questions is counting bold markers via
content.count("**") instead of actual questions; update it to scan
OBSERVATION_GUIDE_CONTENT for real question items (e.g., use a regex or
line-based check against patterns like lines ending with '?' and/or lines
beginning with 'Q:' or a numbered question prefix) and count those matches, then
assert the count >= 7; modify the assertion in
test_guide_content_has_7_questions to use that more accurate question-match
logic against OBSERVATION_GUIDE_CONTENT.

@ryanklee
Collaborator Author

Beta lane merge readiness check (2026-05-26T20:30Z)

All CI checks pass: test, lint, typecheck, security, rust-check, CodeQL (actions/c-cpp/js-ts/python/rust), pr-admission, authority-case-check, freeze-check, homage-visual-regression, web-build, vscode-build, secrets-scan, actionlint, review.

Only failing status: hapax/autoqueue-admission (FAILURE, set at 20:06Z). This is a local commit status, not a GitHub Actions check.

Merge state: UNSTABLE (non-required check failing). Ready for merge queue pending autoqueue-admission resolution.

cc @ryanklee — needs manual merge queue enqueue or autoqueue-admission status investigation.

@ryanklee
Collaborator Author

Governed stale-PR reconciliation note (task 20260531-stale-pr-reconciliation-after-recovery): leaving this open as a repair candidate, not queue-ready. Current blockers: stale/mismatched task linkage, closed-task closure mismatch, AVSDLC release-gate metadata gap, and the branch is far behind current main. Revival path: create or repair a fresh governed cc-task, rebase onto current main, refresh task/PR linkage, rerun checks, and only then re-admit.
