Reliability hardening: self-test harness, integration tests, SLA budgets, diagnostics by Copilot · Pull Request #17 · DaScient-Intelligence/Plan-Examiner

Copilot · 2026-05-06T02:51:14Z

Summary

Closes the gap between "unit tests pass" and "I can prove the scanner is functioning on demand." Adds an end-to-end self-test runnable from the UI and CI, pipeline contracts/SLAs, result versioning, and operational tooling.

Self-test harness — assets/js/agent/self-test.js (PE.SelfTest.run()) + 3 bundled DXF fixtures (clean / non-compliant / sparse) and a golden expected.json (fact keys, must-flag/must-not-flag rule ids, count bands, score band).
Integration test — tests/integration.pipeline.test.js drives real PE.Extractors → selectPacks → RuleEngine → score, no mocks. Asserts the non-compliant fixture actually produces FLAGGED findings (door-width / egress).
Pipeline contracts — _assertExtractionShape / _assertFindingsShape throw structured PIPELINE_CONTRACT errors instead of corrupting downstream state.
SLA budgets — PE.Pipeline.SLA per-step (ingest 15s / extract 8s / evaluate 2s / total 30s) with WARN on breach; enforced by tests/pipeline.perf.test.js.
Result versioning — every result is stamped with runId, engineVersion, rulesVersion (FNV-1a fingerprint of pack id@version:ruleCount tuples), startedAt/completedAt/totalMs.
Rule-engine memoization — _buildContext cached per evaluate() call (reset at call boundary) so derived occupancy/sprinklered/construction context isn't recomputed per rule.
Crash resilience — tests/extractors.crash.test.js covers empty/garbage/non-DXF input, unknown check_fn, null parameters; rule engine is asserted to never emit a status outside {PASS, REVIEW, FLAGGED}.
Diagnostic bundle — PE.Log.exportRun(runId) + entriesForRun() produce a schema-versioned, redacted JSON bundle for support tickets.
In-app Diagnostics button — added to the AI Settings modal; shows per-fixture pass/fail, score, and counts. Same code path as headless CI.
Schema gate in npm test — tests/rules.schema.test.js invokes scripts/validate-rules.mjs.
Rules inventory — scripts/rules-report.mjs regenerates docs/RULES.md (23 packs / 153 rules); CI fails if stale (--check).
Health check — scripts/status.mjs runs the same selftest in Node and writes status.json; .github/workflows/health.yml runs it daily.
Service worker — bumped to plan-examiner-v5; pre-cache extended from 3 packs to all 22 active packs + selftest fixtures + self-test.js.
npm scripts — validate:rules, rules:report, rules:report:check, status.
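The rule-engine memoization item above can be pictured with a small sketch. This is not the code in rule-engine.js (which is not shown on this page); `createEngine`, `getContext`, and the `rule.check(facts, ctx)` signature are hypothetical names used only to illustrate "cached per evaluate() call, reset at the call boundary".

```javascript
// Sketch only: illustrates per-call memoization, not the actual rule-engine.js code.
// buildContext stands in for the _buildContext step named in the summary.
function createEngine(buildContext) {
  let cached = null; // holds the derived context for the current evaluate() call only

  function getContext(facts) {
    if (cached === null) cached = buildContext(facts); // derive once, then reuse
    return cached;
  }

  function evaluate(facts, rules) {
    cached = null; // call boundary: never reuse context from a previous run
    try {
      // Every rule sees the same context object without recomputing it.
      return rules.map((rule) => rule.check(facts, getContext(facts)));
    } finally {
      cached = null; // drop the context so stale facts cannot leak into the next call
    }
  }

  return { evaluate };
}
```

Resetting on both entry and exit keeps the cache scoped to a single run even if a rule throws part-way through.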

// Anyone can verify the scanner end-to-end, no dev tools required:
const { summary, results } = await PE.SelfTest.run();
// summary: { ok, total, passed, failed, durationMs, manifestVersion, ... }
// results[i]: { id, ok, score, counts:{PASS,REVIEW,FLAGGED}, flagged:[...], assertions:[...] }
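For the rulesVersion stamp, the summary describes an FNV-1a fingerprint over pack id@version:ruleCount tuples. The sketch below shows one way such a fingerprint could be computed. The function names, the `|` separator, the sorting step, and the 32-bit hex output are assumptions; the real implementation in pipeline.js is not shown on this page.

```javascript
// Sketch only: the actual fingerprint code in pipeline.js is not shown in this PR page.
// 32-bit FNV-1a. XORs each UTF-16 code unit, which matches byte-wise FNV-1a for ASCII input.
function fnv1a32(str) {
  let hash = 0x811c9dc5; // FNV-1a 32-bit offset basis
  for (let i = 0; i < str.length; i++) {
    hash ^= str.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by the FNV prime, keep unsigned 32-bit
  }
  return hash.toString(16).padStart(8, '0');
}

// Hypothetical helper: builds id@version:ruleCount tuples and sorts them
// so that pack load order does not change the fingerprint.
function rulesFingerprint(packs) {
  const tuples = packs
    .map((p) => `${p.id}@${p.version}:${p.rules.length}`)
    .sort();
  return fnv1a32(tuples.join('|'));
}
```

Because the tuple includes the rule count, adding or removing a rule changes the fingerprint even when the pack version string was not bumped.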

Out of scope (deferred per the original plan's "ship last"): PDF.js / Tesseract worker-ization. Per-pack golden .test.js files are subsumed by the cross-pack selftest fixtures, which exercise all 22 packs / 153 rules per run.

Type of Change

Bug fix (non-breaking change that fixes an issue)
New feature (non-breaking change that adds functionality)
New rule pack (new jurisdiction or code edition)
Breaking change (fix or feature that would cause existing functionality to change)
Documentation update
Accessibility improvement

Rule Pack Changes (if applicable)

Rule pack file: n/a — no rule packs modified
Jurisdiction / Code edition: n/a
Number of rules added/modified: 0 (auto-generated docs/RULES.md reflects current inventory)

Testing

Tested manually in Chrome/Firefox/Safari
Tested with a sample PDF plan
Tested with a sample DXF plan
Tested with a sample DOCX plan
Rule pack JSON validates (run python3 -c "import json; json.load(open('assets/data/rules/your-pack.json'))")

npm test: 96/96 passing (was 79; +17 new). node scripts/validate-rules.mjs: 0 errors. node scripts/status.mjs: 3/3 fixtures pass.

Checklist

My code follows the existing code style
I have performed a self-review of my changes
No new API keys, secrets, or credentials are committed
Relevant documentation has been updated

Screenshots (if applicable)

The new "Run Diagnostics" control lives in the AI Settings modal below "Test connection"; output is rendered in a monospace results panel that turns green on full pass / red on any failure.

Related Issues


Copilot

Pull request overview

This PR adds an end-to-end self-test/diagnostics harness and supporting CI/ops tooling to prove the scanner pipeline (extract → select packs → evaluate → score) is functioning deterministically on demand (in-app, in CI, and via scheduled health checks).

Changes:

Adds PE.SelfTest plus bundled DXF fixtures + golden expectations, and integrates it into the UI (“Run Diagnostics”) and Node (scripts/status.mjs).
Introduces new integration/perf/crash-resilience tests and CI gates (rules schema validation + rules inventory freshness).
Hardens pipeline observability with SLA budgets, structured pipeline contracts, result stamping/versioning, and diagnostic bundle export helpers.
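As an illustration of the SLA budgets mentioned above, the following sketch checks per-step timings against the budgets quoted in the PR description (ingest 15s / extract 8s / evaluate 2s / total 30s) and returns WARN records on breach. `checkSla` and the shape of the warning record are hypothetical; only the budget values come from the PR.

```javascript
// Budgets in milliseconds, taken from the PR description.
const SLA = { ingest: 15000, extract: 8000, evaluate: 2000, total: 30000 };

// Hypothetical helper: returns one WARN record per breached step instead of throwing,
// mirroring the "WARN on breach" behaviour described in the summary.
function checkSla(timings, budgets = SLA) {
  const warnings = [];
  for (const [step, budgetMs] of Object.entries(budgets)) {
    const elapsedMs = timings[step];
    // Steps without a recorded timing are skipped rather than treated as breaches.
    if (typeof elapsedMs === 'number' && elapsedMs > budgetMs) {
      warnings.push({ level: 'WARN', step, elapsedMs, budgetMs });
    }
  }
  return warnings;
}
```

A perf test could then assert that `checkSla(timings)` returns an empty array for each self-test fixture.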

Reviewed changes

Copilot reviewed 21 out of 22 changed files in this pull request and generated 4 comments.

Show a summary per file

File	Description
tests/rules.schema.test.js	Adds a Node test that gates `scripts/validate-rules.mjs` via `npm test`.
tests/pipeline.perf.test.js	Enforces SLA-style performance ceilings using self-test fixtures + targeted checks.
tests/integration.pipeline.test.js	True end-to-end integration test wiring real extractors → rule engine → score against fixtures.
tests/extractors.crash.test.js	Adds crash-resilience matrix for extractors and rule engine output domain.
sw.js	Bumps service worker cache version and expands pre-cache list to include self-test + more packs.
scripts/status.mjs	Adds a headless health check that runs `PE.SelfTest` and writes `status.json`.
scripts/rules-report.mjs	Generates/validates `docs/RULES.md` rule-pack inventory (CI freshness check).
package.json	Adds new scripts: validate rules, generate/check rules report, and status.
index.html	Adds “Run Diagnostics” UI and loads `self-test.js`.
docs/RULES.md	Adds auto-generated rule pack inventory documentation.
assets/js/utils/log.js	Adds `entriesForRun()` and `exportRun()` for run-scoped diagnostic bundles.
assets/js/app.js	Wires the in-app “Run Diagnostics” button to `PE.SelfTest.run()`.
assets/js/agent/self-test.js	Implements the self-test harness runnable in browser + Node.
assets/js/agent/rule-engine.js	Adds per-evaluate context memoization to reduce redundant context derivation.
assets/js/agent/pipeline.js	Adds SLA budgets, pipeline contract assertions, result stamping/versioning, and rules fingerprinting.
assets/data/fixtures/selftest/clean-office.dxf	Adds compliant self-test DXF fixture.
assets/data/fixtures/selftest/non-compliant-assembly.dxf	Adds non-compliant DXF fixture to prove FLAGGED findings occur.
assets/data/fixtures/selftest/sparse-warehouse.dxf	Adds sparse DXF fixture to prove missing evidence yields REVIEW outcomes.
assets/data/fixtures/selftest/expected.json	Adds golden expectations for fixtures (bands, must-flag, must-not-flag, etc.).
.gitignore	Ignores generated `status.json`.
.github/workflows/health.yml	Adds scheduled workflow running `scripts/status.mjs` and uploading `status.json`.
.github/workflows/ci.yml	Adds CI step to ensure `docs/RULES.md` is up-to-date (`--check`).
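The pipeline contract assertions listed for assets/js/agent/pipeline.js could look roughly like the sketch below. The error class, the assumed finding shape (`id` plus `status`), and the function name without the leading underscore are illustrative assumptions; the PIPELINE_CONTRACT code and the {PASS, REVIEW, FLAGGED} status set are taken from the PR description.

```javascript
// Hypothetical error type carrying the PIPELINE_CONTRACT code named in the PR.
class PipelineContractError extends Error {
  constructor(step, problems) {
    super(`Pipeline contract violated at "${step}": ${problems.join('; ')}`);
    this.name = 'PipelineContractError';
    this.code = 'PIPELINE_CONTRACT';
    this.step = step;
    this.problems = problems;
  }
}

// Status domain stated in the PR description.
const VALID_STATUSES = new Set(['PASS', 'REVIEW', 'FLAGGED']);

// Assumed finding shape: { id: string, status: 'PASS' | 'REVIEW' | 'FLAGGED' }.
function assertFindingsShape(findings) {
  const problems = [];
  if (!Array.isArray(findings)) {
    problems.push('findings is not an array');
  } else {
    findings.forEach((f, i) => {
      if (!f || typeof f.id !== 'string') problems.push(`findings[${i}].id missing`);
      if (!f || !VALID_STATUSES.has(f.status)) problems.push(`findings[${i}].status invalid`);
    });
  }
  // Fail loudly with a structured error rather than passing bad data downstream.
  if (problems.length) throw new PipelineContractError('evaluate', problems);
  return findings;
}
```

Collecting all problems before throwing gives a support engineer the full picture from a single error rather than one failure at a time.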

Comments suppressed due to low confidence (1)

sw.js:12

STATIC_ASSETS uses absolute (leading '/') URLs. On GitHub Pages project sites (served under '/<repo>/'), these requests resolve to the origin root (e.g. '/assets/...') and will 404, causing pre-cache to fail and offline mode to be ineffective. Consider generating URLs relative to the service worker scope (e.g. omit the leading '/', or build with new URL('assets/...', self.registration.scope)).

var STATIC_ASSETS = [
  '/',
  '/index.html',
  '/assets/css/styles.css',
  '/assets/js/app.js',
  '/assets/js/agent/rule-engine.js',
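A minimal sketch of the scope-relative approach suggested in the comment above. `resolveAssets` and `RELATIVE_ASSETS` are hypothetical names, and the example scope URL is a placeholder; in sw.js the scope argument would be `self.registration.scope`.

```javascript
// Paths written without a leading '/', so they resolve under the scope, not the origin root.
const RELATIVE_ASSETS = [
  './',
  'index.html',
  'assets/css/styles.css',
  'assets/js/app.js',
  'assets/js/agent/rule-engine.js',
];

// Hypothetical helper: resolves each path against the service worker scope.
// In sw.js this would be called as resolveAssets(self.registration.scope).
function resolveAssets(scope, paths = RELATIVE_ASSETS) {
  return paths.map((p) => new URL(p, scope).href);
}

// Placeholder scope for a project site; the real value comes from the registration.
const urls = resolveAssets('https://example.github.io/Plan-Examiner/');
```

With a leading '/', `new URL('/index.html', scope)` would drop the project path segment, which is the 404 case the comment describes.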


+   * No raw file contents, no API keys (already redacted by _redactString
+   * applied at record time). Safe to share.


    sT0 = _now();
    emit('select', 'running', 'Loading rule packs for ' + (formData.buildingCode || 'IBC 2021') + '…');
    _stepLog('select', 'running', 'loading packs', { buildingCode: formData.buildingCode });
    var packs;
    try {
      packs = await selectPacks(formData.buildingCode || '2024 IBC', facts.buildingType);
      result.packs = packs;


+    var result = {
+      facts: {}, packs: [], findings: [], score: 0, summary: '', correctionLetter: '',
+      runId: _newRunId(),
+      engineVersion: ENGINE_VERSION,
+      rulesVersion: null,         // filled in once packs are selected
+      startedAt: new Date().toISOString()
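The PR description also lists completedAt and totalMs among the stamped fields. A sketch of how a run could be closed out from the startedAt value above follows; `finalizeResult` is a hypothetical helper, not a function shown in the diff.

```javascript
// Hypothetical helper: closes out a run by adding completedAt and totalMs.
// Assumes result.startedAt is an ISO-8601 string, as in the excerpt above.
function finalizeResult(result, now = new Date()) {
  return {
    ...result,
    completedAt: now.toISOString(),
    totalMs: now.getTime() - Date.parse(result.startedAt),
  };
}
```

Returning a new object leaves the in-progress result untouched, which keeps the sketch free of side effects; the real pipeline may mutate the result in place instead.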


+lines.push('<!-- Auto-generated by scripts/rules-report.mjs. Do not edit by hand. -->');
+lines.push('# Plan-Examiner Rule-Pack Inventory');
+lines.push('');
+lines.push(`Generated: ${new Date().toISOString().slice(0, 10)}  `);
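Because the header embeds the generation date, a byte-for-byte freshness check would report docs/RULES.md as stale on any later day even when no rule changed. Whether rules-report.mjs already accounts for this is not visible in the excerpt. The sketch below shows one possible way to compare while ignoring the date line; both function names are hypothetical.

```javascript
// Hypothetical normalizer: blank out the date so only rule content is compared.
// Matches the "Generated: YYYY-MM-DD" line, including its trailing spaces.
function stripGeneratedLine(markdown) {
  return markdown.replace(/^Generated: \d{4}-\d{2}-\d{2}[ \t]*$/m, 'Generated: <date>');
}

// Returns true only when something other than the date differs.
function isStale(onDisk, freshlyRendered) {
  return stripGeneratedLine(onDisk) !== stripGeneratedLine(freshlyRendered);
}
```

An alternative with the same effect would be to drop the date from the generated file entirely and rely on git history for provenance.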


Copilot AI and others added 2 commits May 6, 2026 02:42

Add self-test harness, integration tests, SLA budgets, and pipeline c…

5c2bc61

…ontracts Agent-Logs-Url: https://github.com/DaScient-Intelligence/Plan-Examiner/sessions/594db8e5-858f-4cf8-ba13-0eb0a71df477 Co-authored-by: DaScient <25983786+DaScient@users.noreply.github.com>

Add perf/crash tests, diagnostics UI, log.exportRun, rules-report, st…

000f750

…atus, health workflow Agent-Logs-Url: https://github.com/DaScient-Intelligence/Plan-Examiner/sessions/594db8e5-858f-4cf8-ba13-0eb0a71df477 Co-authored-by: DaScient <25983786+DaScient@users.noreply.github.com>

Copilot AI assigned Copilot and DaScient May 6, 2026

Copilot created this pull request from a session on behalf of DaScient May 6, 2026 02:51 View session

DaScient marked this pull request as ready for review May 6, 2026 02:52

Copilot AI review requested due to automatic review settings May 6, 2026 02:52

DaScient merged commit 4cc4ea6 into main May 6, 2026
9 checks passed

Copilot started reviewing on behalf of DaScient May 6, 2026 02:52 View session

Copilot AI reviewed May 6, 2026

DaScient merged 2 commits into main from copilot/ensure-document-scanning-functionality


3 participants
