Summary
When resuming a workflow with the web dashboard (conductor resume <workflow.yaml> --web or --web-bg), the dashboard starts fresh: the timeline / agent panels / activity stream show nothing about agents that completed before the checkpoint. The user effectively loses all visual context of what the workflow already did.
The execution itself is correct — WorkflowContext (agent outputs, execution history) is restored in the engine and downstream agents do receive prior outputs — but the dashboard UI is blank for everything that ran before the checkpoint.
Reproduction
- Run conductor run examples/some-workflow.yaml --web and let several agents complete.
- Trigger a failure (or stop mid-run) so a checkpoint is saved.
- Run conductor resume examples/some-workflow.yaml --web.
- Open the dashboard.
Expected
The dashboard should show prior completed agents (status, outputs, timestamps, messages) so the user has the full visual context of the run, with execution continuing live from the resumed agent.
Actual
The dashboard shows the static workflow graph, but every node/panel for previously completed agents is empty. Only events from the resumed agent forward appear.
Root cause
WebDashboard accumulates its state purely from live events on the WorkflowEventEmitter (see src/conductor/web/server.py):
self._event_history: list[dict[str, Any]] = []
...
self._emitter.subscribe(self._on_event)
/api/state and the late-joiner WebSocket replay both serve from self._event_history. When the dashboard is started during resume_workflow_async (src/conductor/cli/run.py), no historical events are ever fed into it.
The behaviour is acknowledged in AGENTS.md:
Note: on resume, the dashboard only shows events from the resumed agent forward — events from agents that completed before the checkpoint were emitted in the original process and are not replayed.
…and in the docstring of resume_workflow_async:
the dashboard only shows events from the resumed agent forward; agent runs that completed before the checkpoint are not replayed.
The checkpoint does fully preserve workflow state (context.agent_outputs, context.execution_history, copilot_session_ids) but does not record the original run_id or JSONL event log path, so today there isn't even a way to find the original event log to replay from.
Suggested fix
Two reasonable options, not mutually exclusive:
Option A — replay events from the original JSONL log (preferred when available)
- Add event_log_path (and run_id) to CheckpointData / save_checkpoint so the checkpoint knows where the original *.events.jsonl lives. (The path is already available via EventLogSubscriber.path at the time the checkpoint is written.)
- On resume, before subscribing the dashboard to the emitter, read the JSONL line-by-line and call dashboard._on_event (or expose a replay_events() method) for each event. New live events from the resumed run are appended after the historical ones.
- Keep the workflow-level run_id stable across resumes so timeline / log-correlation tools see one continuous run rather than a fresh one.
This gives the user the full original timeline (messages, tool calls, reasoning, etc.) — not just status.
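A minimal sketch of the replay step in Option A, written as a standalone function so it can be read without the rest of the codebase. It assumes the handler accepts a single event dict (the real WebDashboard._on_event signature may differ) and that each log line is one JSON object; the function name and the decision to skip malformed lines are illustrative, not existing behaviour.

```python
import json
from pathlib import Path
from typing import Any, Callable


def replay_events_from_jsonl(
    path: Path,
    on_event: Callable[[dict[str, Any]], None],
) -> int:
    """Feed each event stored in a JSONL log to on_event; return the count.

    Hypothetical helper: in the dashboard, on_event would be the same
    handler the emitter calls today (assumed to take one event dict).
    """
    replayed = 0
    with path.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            try:
                event = json.loads(line)
            except json.JSONDecodeError:
                # e.g. a truncated final line if the original process
                # died in the middle of a write
                continue
            if isinstance(event, dict):
                on_event(event)
                replayed += 1
    return replayed
```

Skipping unparseable lines rather than raising is a design choice for this sketch: a run that ended in a crash is exactly the case where resume is used, so the last line of the log may be incomplete.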
Option B — synthesize summary events from the restored context (fallback)
If no event log file is available (older checkpoints, log file deleted, etc.), synthesize minimal events from restored_context.execution_history + restored_context.agent_outputs so each prior agent at least appears as completed in the dashboard with its final output, even without the intermediate streaming events.
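One possible shape for the synthesized events, as a rough sketch only. It assumes execution_history is an ordered list of agent names and agent_outputs maps agent name to final output; the event keys ("type", "agent", "output", "synthetic") are placeholders and would need to match whatever the frontend actually renders.

```python
from typing import Any


def synthesize_completed_events(
    execution_history: list[str],
    agent_outputs: dict[str, Any],
) -> list[dict[str, Any]]:
    """Build one summary event per agent run recorded before the checkpoint.

    Assumed inputs: an ordered list of agent names and a name -> output map.
    The event fields are illustrative, not the project's real schema.
    """
    events: list[dict[str, Any]] = []
    for name in execution_history:
        events.append(
            {
                "type": "agent_completed",
                "agent": name,
                # Only the final output per agent survives in the context,
                # so repeated runs of one agent all show the same output.
                "output": agent_outputs.get(name),
                "synthetic": True,  # lets the UI mark the entry as reconstructed
            }
        )
    return events
```

A "synthetic" marker of this kind would let the dashboard distinguish reconstructed entries from replayed ones, which matters because timestamps and intermediate messages are not recoverable on this path.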
Implementation sketch:
# In resume_workflow_async, after creating the dashboard but before
# the engine starts emitting new events:
if dashboard is not None:
    if cp.event_log_path and Path(cp.event_log_path).exists():
        dashboard.replay_events_from_jsonl(Path(cp.event_log_path))
    else:
        dashboard.replay_synthetic_from_context(restored_context, config)
replay_events_from_jsonl would just iterate JSON lines and call the same _on_event handler the emitter uses today, so every existing rendering path on the frontend works unchanged (history is already broadcast to late-joining WebSocket clients via /api/state).
Additional polish
- Drop a banner in the dashboard header indicating "Resumed from checkpoint at " so users understand why earlier events have a gap or differ in style.
- Make the same data available on the run side too: --web-bg clients reconnecting to a still-running process already get history via /api/state, so this brings parity for the resume case.
Workaround
For now, users can open the original JSONL event log directly (look in $TMPDIR/conductor/conductor-<workflow>-<timestamp>.events.jsonl), but there is no automatic correlation between a checkpoint and its source log file.
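Until a checkpoint records its log path, the manual lookup can be scripted. The helper below is a hypothetical convenience, not part of conductor: it assumes the file naming pattern quoted above and picks the most recently modified match, which is only a guess at which log belongs to the checkpoint being resumed.

```python
import tempfile
from pathlib import Path
from typing import Optional


def find_latest_event_log(
    workflow: str,
    base_dir: Optional[Path] = None,
) -> Optional[Path]:
    """Return the most recently modified event log for workflow, or None.

    base_dir defaults to <system temp dir>/conductor; tempfile.gettempdir()
    honours the TMPDIR environment variable.
    """
    if base_dir is None:
        base_dir = Path(tempfile.gettempdir()) / "conductor"
    # Pattern taken from the path above; a workflow named "foo" would also
    # match logs of a workflow named "foo-bar", so check the result.
    candidates = list(base_dir.glob(f"conductor-{workflow}-*.events.jsonl"))
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

Choosing by modification time is a heuristic; if several runs of the same workflow overlap, the newest file may not be the one the checkpoint came from.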
Related
src/conductor/cli/run.py — resume_workflow_async (dashboard init)
src/conductor/web/server.py — WebDashboard._event_history / /api/state
src/conductor/engine/checkpoint.py — CheckpointData schema
src/conductor/engine/event_log.py — EventLogSubscriber (source of replayable events)
AGENTS.md — "Run / Resume Parity" section (last bullet documents the gap)