Situation
run_claude() in tools/knowledge-creator/scripts/common.py only saves execution logs (.json, .out.json, .ndjson) when subprocess.run() returns returncode=0. On error, only generate/{file_id}.json with {"status": "error", "error": ""} is recorded.
The .in.txt is saved before the CC call (by design, as a crash-safety measure), but nothing is written afterwards when the call fails.
Pain
When a CC execution fails, no log is available to determine:
- What returncode was returned
- What stderr contained
- Whether any partial NDJSON stream was produced
This was discovered during nabledge-1.4 generation (#122), where 3 files failed with empty error messages and no execution logs, making root cause analysis difficult. All that could be determined was timing (from execution.log) and that the CC call took ~31-34 minutes before failing.
Benefit
- Developers can inspect returncode and stderr for any failed CC execution
- Partial NDJSON output (if any) is preserved for debugging
- Error message in generate/{file_id}.json includes the actual cause instead of ""
Implementation Sketch
On the error path (returncode != 0), save a log and return a descriptive error:
# Record what is known about the failed call
with open(f"{base}.json", 'w', encoding='utf-8') as f:
    json.dump({
        "file_id": file_id,
        "timestamp": timestamp,
        "returncode": result.returncode,
        "stderr": result.stderr[:2000],  # truncate large stderr
        "stdout_len": len(result.stdout),
    }, f, ensure_ascii=False, indent=2)
# Preserve any partial NDJSON stream for debugging
if result.stdout.strip():
    with open(f"{base}.ndjson", 'w', encoding='utf-8') as f:
        f.write(result.stdout)
# Return an error result whose stderr is never empty
return subprocess.CompletedProcess(
    args=result.args, returncode=result.returncode,
    stdout="", stderr=result.stderr or f"CC exited with returncode={result.returncode}"
)
Success Criteria
- On error, run_claude() saves a .json log with: returncode, stderr (truncated if large), stdout length, timestamp
- Partial NDJSON output (if any) is saved as .ndjson for debugging
- generate/{file_id}.json error field is populated with the actual stderr or a descriptive message (e.g. "timeout or crash: returncode=1, stderr=...")
- subprocess.run() timeout is set (e.g. 2400s) so Python-level TimeoutExpired is raised and logged with message "subprocess timeout after Xs"
Related
- .pr/00224/ungenerated-3-investigation.md for full analysis