Severity: High
Several mocks paper over real failure modes and contract details. Tests pass even when production code is silently broken.
1. aevatar-client mock returns { stream: null } — bypasses the entire event pipeline
sisyphus-api/test/runner/runner.test.ts:6-17:
vi.mock('../../src/runner/aevatar-client.js', () => ({
startExecution: vi.fn().mockResolvedValue({ executionId: 'exec-123', stream: null }),
streamEvents: vi.fn().mockResolvedValue(null),
...
}));
stream: null → consumeEventsFromStream early-returns at session-manager.ts:298. The entire event-pipeline (parseSseStream → eventCol.insertOne → broadcastEvent → markCompleted/markFailed) is never executed in any test. A breaking change in AG-UI event shape (e.g. AG renames runFinished → runComplete) would not fail any test.
Fix: add a fixture SSE stream (new ReadableStream emitting data: {"runFinished":{...}}\n); verify event row inserted, session marked completed, error path with runError marks failed.
2. ornn-client + reference-resolver disagree on response shape
sisyphus-api/src/workflows/ornn-client.ts:39-44 reads result.content directly.
sisyphus-api/src/workflows/reference-resolver.ts:89 reads result.data?.files.
The mock in test/workflows/workflow.test.ts:6-18 matches ornn-client's assumption only. If the real ornn returns { data: { content, files: ... } } (which reference-resolver suggests), fetchSkillContent returns empty string in production but tests still pass.
Fix: add a contract test that exercises both code paths against ONE fixture mirroring the real ornn response. Investigate which shape ornn actually returns and align both consumers.
3. dto-mappers casing not asserted
test/workflows/workflow.test.ts:222-231 checks workflowYaml contains "mathematical researcher" but never inspects connectorJson[0] shape. The mainnet DTO uses PascalCase (Name, Http.AllowedMethods, TimeoutMs: 60000, …); a regression to camelCase silently breaks every deploy.
Fix: explicit field-name assertions, or use the dto-mappers unit test from the previous issue. Add an mcp-typed connector test (today 100% coverage is http).
4. The "409 already running" assertion is testing a behavior that doesn't exist
test/runner/runner.test.ts:75-83 claims the second startSession("research", …) returns 409. But WorkflowRunController.start only returns 409 when the message contains "already running" (:83) — and session-manager.startSession doesn't throw with that message. Today the mocked startSession happily creates a duplicate; the 409 is not actually validating duplicate prevention.
Fix: either (a) implement real per-type duplicate prevention (see #28 — TOCTOU bug), or (b) drop the bogus assertion. Don't claim to test what isn't tested.
Severity: High
Several mocks paper over real failure modes and contract details. Tests pass even when production code is silently broken.
1. aevatar-client mock returns
{ stream: null }— bypasses the entire event pipelinesisyphus-api/test/runner/runner.test.ts:6-17:stream: null→consumeEventsFromStreamearly-returns atsession-manager.ts:298. The entire event-pipeline (parseSseStream → eventCol.insertOne → broadcastEvent → markCompleted/markFailed) is never executed in any test. A breaking change in AG-UI event shape (e.g. AG renamesrunFinished→runComplete) would not fail any test.Fix: add a fixture SSE stream (
new ReadableStreamemittingdata: {"runFinished":{...}}\n); verify event row inserted, session marked completed, error path withrunErrormarks failed.2. ornn-client + reference-resolver disagree on response shape
sisyphus-api/src/workflows/ornn-client.ts:39-44readsresult.contentdirectly.sisyphus-api/src/workflows/reference-resolver.ts:89readsresult.data?.files.The mock in
test/workflows/workflow.test.ts:6-18matchesornn-client's assumption only. If the real ornn returns{ data: { content, files: ... } }(whichreference-resolversuggests),fetchSkillContentreturns empty string in production but tests still pass.Fix: add a contract test that exercises both code paths against ONE fixture mirroring the real ornn response. Investigate which shape ornn actually returns and align both consumers.
3. dto-mappers casing not asserted
test/workflows/workflow.test.ts:222-231checksworkflowYamlcontains "mathematical researcher" but never inspectsconnectorJson[0]shape. The mainnet DTO uses PascalCase (Name,Http.AllowedMethods,TimeoutMs: 60000, …); a regression to camelCase silently breaks every deploy.Fix: explicit field-name assertions, or use the dto-mappers unit test from the previous issue. Add an
mcp-typed connector test (today 100% coverage ishttp).4. The "409 already running" assertion is testing a behavior that doesn't exist
test/runner/runner.test.ts:75-83claims the secondstartSession("research", …)returns 409. ButWorkflowRunController.startonly returns 409 when the message contains"already running"(:83) — andsession-manager.startSessiondoesn't throw with that message. Today the mockedstartSessionhappily creates a duplicate; the 409 is not actually validating duplicate prevention.Fix: either (a) implement real per-type duplicate prevention (see #28 — TOCTOU bug), or (b) drop the bogus assertion. Don't claim to test what isn't tested.