refactor(daily-actor): split SkillRunnerGAgent into DailyReportSubscriptionGAgent + DailyReportRunGAgent #447

@eanzhao

Architectural follow-up surfaced in docs/audit-scorecard/2026-04-27-daily-pipeline-architecture-review.md §B4.

Symptom

SkillRunnerGAgent is named after a technical role ("the thing that runs skills on a schedule") and carries two unrelated lifetimes in one actor:

  • Long-lived subscription state: cron, target, GitHub binding, skill_content
  • Per-execution session state: last_run_at, last_output, error_count, retry_attempt

It is also a polymorphic actor: template_name decides whether "this" is a daily report or a social-media drafter, and the actual business semantics live entirely inside the frozen skill_content LLM prompt string. Anything expressible as a prompt becomes an "agent".

Concrete consequence: ScheduleRetryAsync and ScheduleNextRunAsync share the ChannelScheduleRunner and retry lease machinery, so they can collide. Failure semantics from one execution pollute the long-lived subscription state. Issue #439 (silent failure) is hard to fix cleanly because the runner has no "this run" boundary to fail without contaminating "the subscription".

Architectural violations

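  • CLAUDE.md "Actors are named after the business... technical-function names such as WriteActor / ReadModelActor / StoreActor are forbidden".
  • CLAUDE.md "Short-lived by default: a capability that completes within one execution/session/orchestration is modeled as a run/session/task-scoped actor; long-lived actors are limited to fact owners".
  • CLAUDE.md "An actor is a business entity: one actor = one business entity". A "skill runner" is not a business entity; "the user's daily GitHub report subscription" is.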

Proposed direction

Split into two actors per template:

  • DailyReportSubscriptionGAgent (long-lived, fact owner)

    • State: cron, timezone, owner identity, GitHub username, delivery target reference, prompt template id+version (#refactor-prompt-template), enabled
    • Commands: CreateSubscription, Disable, Enable, UpdateSchedule, Delete
    • Events: SubscriptionInitialized, SubscriptionDisabled, SubscriptionEnabled, NextRunScheduled, SubscriptionTombstoned
    • Does NOT execute LLM / GitHub calls itself; spawns a run actor per scheduled fire.
  • DailyReportRunGAgent (session-scoped, one per execution)

    • State: started_at, completed_at, output, errors[], retry attempts, source subscription_id + scheduled_at
    • Commands: StartRun, internal continuations
    • Events: RunStarted, RunStepCompleted, ToolCallFailed, RunCompleted, RunFailed
    • Owns retry semantics in full — subscription doesn't know what "retry" is.
    • Naturally discardable / archivable; readmodel for run history is its own projection.

Same pattern for social_media: SocialMediaPostSubscriptionGAgent + SocialMediaPostRunGAgent. The generic skill_runner abstraction disappears.
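The split can be sketched in a few lines. This is a minimal illustration in Python, not the project's actual actor framework: the class names, the `on_scheduled_fire` and `start` methods, the `execute` callback, and the `max_attempts` default are all hypothetical. It only aims to show that the subscription holds no per-execution fields and that retries stay inside a single run instance.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Callable, List, Optional


@dataclass
class DailyReportSubscription:
    """Long-lived fact owner: schedule and bindings only, no per-run fields."""
    subscription_id: str
    cron: str
    github_username: str
    enabled: bool = True

    def on_scheduled_fire(self, scheduled_at: datetime) -> Optional["DailyReportRun"]:
        # The subscription never executes LLM / GitHub calls; it only spawns a run.
        if not self.enabled:
            return None
        return DailyReportRun(self.subscription_id, scheduled_at)


@dataclass
class DailyReportRun:
    """Session-scoped: one instance per scheduled execution; owns retries."""
    subscription_id: str
    scheduled_at: datetime
    retry_attempts: int = 0  # counts failed attempts so far
    errors: List[str] = field(default_factory=list)
    output: Optional[str] = None
    status: str = "pending"  # pending -> completed | failed

    def start(self, execute: Callable[[], str], max_attempts: int = 3) -> None:
        # Retry inside the same run instance; the subscription is never touched.
        while self.status == "pending":
            try:
                self.output = execute()
                self.status = "completed"
            except Exception as exc:
                self.errors.append(str(exc))
                self.retry_attempts += 1
                if self.retry_attempts >= max_attempts:
                    self.status = "failed"
```

A failed run ends in `status == "failed"` with its own `errors` list, while the subscription object is unchanged and can schedule the next fire as usual.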

Cross-cutting wins:

Acceptance

  • No actor named SkillRunnerGAgent remains in the codebase, and the same holds for the proto.
  • DailyReportSubscriptionGAgent state contains zero per-execution fields.
  • Each daily run is one DailyReportRunGAgent activation; on retry the same run actor resumes (not a new one).
  • Existing /daily, /run-agent, /disable-agent, /enable-agent, /delete-agent commands all keep working with new actor names underneath.
  • Migration path defined: existing SkillRunnerGAgent instances convert to DailyReportSubscriptionGAgent (or SocialMediaPostSubscriptionGAgent) on next activation, OR a one-shot replay tool produces the equivalent snapshot.
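One possible shape for the conversion step, sketched in Python under stated assumptions: legacy state is treated as a plain dictionary, the `template_name` values `"daily_report"` and `"social_media"` are hypothetical placeholders, and `migrate_legacy_state` is an illustrative helper rather than an existing function. The sketch only drops the per-execution fields and picks the target actor type; mapping `skill_content` onto a versioned prompt template reference is left out.

```python
from typing import Any, Dict, Tuple

# Per-execution fields listed under "Symptom"; they must not reach the subscription.
PER_EXECUTION_FIELDS = {"last_run_at", "last_output", "error_count", "retry_attempt"}

# Hypothetical template_name values mapped to the new actor type names.
TARGET_ACTOR_BY_TEMPLATE = {
    "daily_report": "DailyReportSubscriptionGAgent",
    "social_media": "SocialMediaPostSubscriptionGAgent",
}


def migrate_legacy_state(legacy: Dict[str, Any]) -> Tuple[str, Dict[str, Any]]:
    """Return (target actor type, subscription state) for one legacy instance."""
    template = legacy.get("template_name")
    if template not in TARGET_ACTOR_BY_TEMPLATE:
        # Fail loudly instead of guessing which subscription type applies.
        raise ValueError(f"unknown template_name: {template!r}")
    subscription_state = {
        key: value
        for key, value in legacy.items()
        if key not in PER_EXECUTION_FIELDS and key != "template_name"
    }
    return TARGET_ACTOR_BY_TEMPLATE[template], subscription_state
```

The same function could back either option in the migration bullet: called lazily on next activation, or driven by a one-shot replay tool over all stored instances.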

Dependencies

  • #refactor-credential-actor — credential actor referenced by subscription.
  • #refactor-prompt-template — versioned prompt referenced by subscription.

Related
