Skip to content

feat(studio): member-first authority + bind orchestration (#325)#428

Merged
eanzhao merged 7 commits intodevfrom
feat/2026-04-27_studio-member-first-apis
Apr 27, 2026
Merged

feat(studio): member-first authority + bind orchestration (#325)#428
eanzhao merged 7 commits intodevfrom
feat/2026-04-27_studio-member-first-apis

Conversation

@eanzhao
Copy link
Copy Markdown
Contributor

@eanzhao eanzhao commented Apr 27, 2026

Closes #325.

Summary

  • Introduces a real member backend authority (StudioMemberGAgent) so Studio binds workflows / scripts / gagents to a member's own stable publishedServiceId instead of falling back to the scope-default service.
  • publishedServiceId is generated once at create time from the immutable memberId and persisted on the actor — never recomputed on read, never mutated by rename.
  • New member-first HTTP surface under /api/scopes/{scopeId}/members[...]. ServiceId is no longer a user-facing input.
  • Bind orchestration delegates to the existing IScopeBindingCommandPort with ServiceId = publishedServiceId, then records the resulting revision back on the member authority — no second binding system.

Acceptance criteria from #325

  • First-class member model with persistent memberId and publishedServiceId.
  • Creating a member supports workflow / script / gagent.
  • publishedServiceId is backend-generated, stable, and rename-safe (locked in by StudioMemberGAgentStateTests + StudioMemberConventionsTests).
  • Studio can list members from backend without inferring them from workflow files or services.
  • Member binding publishes to the member's own stable published service, not the scope default service (locked in by StudioMemberServiceBindingTests).
  • Binding supports workflow, script, and gagent members.
  • serviceId is not a user-facing input on member endpoints.
  • Legacy scope-first endpoints under Aevatar.GAgentService.Hosting remain untouched and continue to work.

What changed

agents/Aevatar.GAgents.StudioMember/ (new)

  • studio_member_messages.proto — typed state, events, implementation refs (no generic bag).
  • StudioMemberGAgent.cs — actor that owns the canonical authority state.
  • StudioMemberConventions.cs — actor id (studio-member:{scopeId}:{memberId}) + rename-safe publishedServiceId (member-{memberId}).

src/Aevatar.Studio.Projection

  • ReadModels/studio_projection_readmodels.proto — adds StudioMemberCurrentStateDocument with denormalized roster fields.
  • Projectors/StudioMemberCurrentStateProjector.cs — materializes committed events into the read model.
  • QueryPorts/ProjectionStudioMemberQueryPort.cs — pure-read query port; roster scans constrained to scope_id.
  • CommandServices/ActorDispatchStudioMemberCommandService.cs — write-side command port that dispatches via IStudioActorBootstrap + IActorDispatchPort.

src/Aevatar.Studio.Application

  • Studio/Abstractions/IStudioMemberCommandPort.cs, IStudioMemberQueryPort.cs, IStudioMemberService.cs.
  • Studio/Contracts/MemberContracts.cs + MemberImplementationKindMapper.cs — wire ↔ proto mapping.
  • Studio/Services/StudioMemberService.cs — bind orchestration (resolves member → builds ScopeBindingUpsertRequest with ServiceId = publishedServiceId → delegates to IScopeBindingCommandPort → records revision back on member authority).

src/Aevatar.Studio.Hosting

  • Endpoints/StudioMemberEndpoints.csPOST/GET/PUT under /api/scopes/{scopeId}/members[...].
  • DI registrations + StudioMemberCurrentStateDocument document store provider hooks (Elasticsearch + InMemory).

test/Aevatar.Studio.Tests (+19 new tests)

  • StudioMemberConventionsTests — actor-id layout, rename-safety of publishedServiceId.
  • StudioMemberImplementationKindMapperTests — wire ↔ proto mapping.
  • StudioMemberGAgentStateTests — state transitions including rename-safety invariant.
  • StudioMemberServiceBindingTests — bind orchestration: ServiceId is always the member's publishedServiceId, revision recorded back, all three impl kinds supported.

Test plan

  • dotnet build aevatar.slnx --nologo — 0 errors.
  • dotnet test test/Aevatar.Studio.Tests/Aevatar.Studio.Tests.csproj — 235 passed (no regressions).
  • bash tools/ci/projection_state_version_guard.sh
  • bash tools/ci/projection_state_mirror_current_state_guard.sh
  • bash tools/ci/projection_route_mapping_guard.sh
  • bash tools/ci/query_projection_priming_guard.sh
  • bash tools/ci/test_stability_guards.sh
  • bash tools/ci/workflow_binding_boundary_guard.sh
  • bash tools/ci/proto_lint_guard.sh
  • bash tools/ci/committed_state_projection_guard.sh
  • bash tools/ci/cqrs_eventsourcing_boundary_guard.sh

Non-goals (matches issue)

  • No frontend work in this PR.
  • Existing scope-first endpoints (PUT /api/scopes/{scopeId}/binding etc.) stay untouched for legacy / runtime callers.
  • No team-router redesign.

🤖 Generated with Claude Code

…325)

Issue: #325

Adds a real `member` backend model so Studio binds workflows / scripts /
gagents to a member's own stable `publishedServiceId` instead of falling
back to the scope-default service.

Authority + read model
- New `StudioMemberGAgent` (in `agents/Aevatar.GAgents.StudioMember`)
  owns canonical member state (`memberId`, `scopeId`, `displayName`,
  `description`, `implementationKind`, `implementationRef`,
  `publishedServiceId`, `lifecycleStage`, `lastBinding`).
- `publishedServiceId` is generated once at create time from the
  immutable `memberId` and persisted on the actor — never recomputed on
  read, never mutated by rename.
- New `StudioMemberCurrentStateDocument` projector + query port; roster
  scans constrained to the requested `scope_id`. Rendered through the
  same `StudioMaterializationContext` as other Studio current-state
  projectors — no second materialization pipeline.

Member-first HTTP surface (under `/api/scopes/{scopeId}`)
- `POST   /members`
- `GET    /members`
- `GET    /members/{memberId}`
- `PUT    /members/{memberId}/binding`
- `GET    /members/{memberId}/binding`
- Endpoints depend only on `IStudioMemberService`; ServiceId is never a
  user-facing input.

Bind orchestration
- `StudioMemberService.BindAsync` resolves the member, builds a
  `ScopeBindingUpsertRequest` with `ServiceId = publishedServiceId`,
  delegates to the existing `IScopeBindingCommandPort`, and records the
  resulting revision back on the member authority. Reuses workflow /
  script / gagent binding paths — no second binding system.

Tests (235 Studio tests pass, +19 new)
- `StudioMemberConventionsTests` — actor-id layout, rename-safety of
  `publishedServiceId`.
- `StudioMemberImplementationKindMapperTests` — wire ↔ proto mapping.
- `StudioMemberGAgentStateTests` — state transitions including
  rename-safety invariant.
- `StudioMemberServiceBindingTests` — bind orchestration locks in:
  ServiceId is always the member's `publishedServiceId`, the resulting
  revision is recorded back on the member authority, and workflow /
  script / gagent each route through the same orchestration.

Validation
- `dotnet build aevatar.slnx`: 0 errors.
- `dotnet test test/Aevatar.Studio.Tests`: 235 passed.
- Targeted CI guards (`projection_state_version_guard`,
  `projection_state_mirror_current_state_guard`,
  `projection_route_mapping_guard`, `query_projection_priming_guard`,
  `test_stability_guards`, `workflow_binding_boundary_guard`,
  `proto_lint_guard`, `committed_state_projection_guard`,
  `cqrs_eventsourcing_boundary_guard`): all pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 95c96e5d85

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +123 to +126
if (next.LifecycleStage == StudioMemberLifecycleStage.Created
&& HasResolvedImplementationRef(evt.ImplementationRef))
{
next.LifecycleStage = StudioMemberLifecycleStage.BuildReady;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Downgrade lifecycle after implementation updates post-bind

This transition only upgrades created -> build_ready, so if a member is already bind_ready and its implementation is changed, the state stays bind_ready and still looks invokable even though the new implementation has not been rebound. In that scenario, the read model can advertise readiness while last_binding still points at an older revision, which can drive callers to invoke stale behavior.

Useful? React with 👍 / 👎.

Comment on lines +30 to +34
if (!string.Equals(State.MemberId, evt.MemberId, StringComparison.Ordinal))
{
throw new InvalidOperationException(
$"member already initialized with id '{State.MemberId}'.");
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Reject conflicting re-create payloads for existing member IDs

The duplicate-create guard only compares member_id; any re-create with the same ID but different display name/description/implementation silently passes this check and is treated as a no-op. That means conflicting create requests can be accepted without surfacing an error, leaving persisted authority state unchanged while callers believe their new payload was applied.

Useful? React with 👍 / 👎.

Comment on lines +45 to +46
Take = DefaultRosterPageSize,
};
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Expose pagination or fetch all pages in member roster queries

The roster query is hard-capped to 200 documents, but StudioMemberRosterResponse has no cursor/continuation token and this method does not iterate NextCursor. For scopes with more than 200 members, entries beyond the first page are silently omitted, so clients cannot enumerate the full authoritative member set via the list endpoint.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor Author

@eanzhao eanzhao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review summary

The shape of the PR is solid — typed proto contracts, dedicated actor, dedicated projector, denormalized roster fields, scope-bound query. The unit tests around publishedServiceId rename-safety and bind orchestration are clean and on-target.

However I see one dead injection, one lifecycle-model gap that contradicts issue #325, and several smaller concerns. Detailed inline comments below; the most important ones grouped here:

1. implementation_ref is modeled but never populated, and bind_ready is unreachable through build_ready

Issue #325 explicitly lists implementationRef as a required field on the member model, and frames the lifecycle as build → bind:

Bind should mean: publish the current member revision to that member's service.

In this PR:

  • StudioMemberState.implementation_ref is on the proto.
  • StudioMemberImplementationUpdatedEvent and HandleImplementationUpdated exist.
  • IStudioMemberCommandPort.UpdateImplementationAsync and ActorDispatchStudioMemberCommandService.UpdateImplementationAsync exist.

But no service / HTTP path ever calls UpdateImplementationAsync. StudioMemberService.BindAsync accepts the workflow yaml / script id / actor type in the bind request body and forwards it directly to IScopeBindingCommandPort — without first persisting it on the member authority via UpdateImplementationAsync.

Consequences:

  • member.implementation_ref is always empty. The denormalized state_root inside StudioMemberCurrentStateDocument reflects this — the read model has no idea what the member is implemented by.
  • LifecycleStage skips BuildReady entirely: ApplyBound sets BindReady regardless of current stage, so members go Created → BindReady directly. The three-stage lifecycle in the proto and in the spec is collapsed to two stages by the actual flow.
  • The bind flow violates the issue’s mental model: it conflates Build (declare the implementation revision) and Bind (publish it to my service). The correct model is that bind reads the current revision off the member, not from the request body.

This is a scope/architecture gap, not a small polish issue. Two ways to close it:

  • Add a separate PUT /members/{memberId}/implementation (or fold it into POST /members + later updates) that calls UpdateImplementationAsync, and have BindAsync use the persisted implementation_ref instead of the request body. This matches issue #325.
  • Or, keep the current single-shot bind but include a RecordImplementationAsync step inside BindAsync so the member state is consistent with the published revision after bind. Less clean, but at least the actor stops lying.

Either way, today the actor’s implementation_ref field and the BuildReady stage are dead surface that the read model still reports.

2. Dead IStudioMemberQueryPort dependency in the command service

ActorDispatchStudioMemberCommandService injects IStudioMemberQueryPort _queryPort (assigned in the ctor) but never reads it. Per CLAUDE.md “删除优先:空转发、重复抽象、无业务价值代码直接删除,不保留兼容空壳” — drop the field + ctor parameter, or actually use it (e.g. for the duplicate-create check).

3. UpdateImplementationAsync and StudioMemberRenamedEvent are unused public surface

Same “删除优先” point. UpdateImplementationAsync is on the command port + implementation but no caller. StudioMemberRenamedEvent is handled by the actor and projector but no service / endpoint emits it. If they’re meant for the next PR, fine — but right now they are uncovered surface that drift if not exercised. Either wire them in this PR (preferred — see point 1) or drop them and add later when needed.

4. state_root Any-pack of full actor state in the read-model document

StudioMemberCurrentStateDocument.state_root: Any packs StudioMemberState directly, and ProjectionStudioMemberQueryPort.GetAsync unpacks it to read implementation_ref and last_binding. CLAUDE.md is explicit:

状态镜像契约面向查询:state mirror payload 作为 readmodel 输入时须是面向读侧的稳定强类型契约,非 actor 内部 state 的原样 dump.

UserConfigCurrentStateDocument (also Studio) is the right pattern — fully denormalized, no state_root. The new doc here should denormalize implementation_ref (e.g. implementation_workflow_id, implementation_script_id, implementation_actor_type_name + implementation_workflow_revision/script_revision) and last_binding (last_bound_at, the kind, etc.) — the partial denormalization (last_bound_revision_id only) is the worst of both worlds because it forces the query port to peek at the actor’s internal state. I get that other Studio docs (GAgentRegistryCurrentStateDocument, ConnectorCatalogCurrentStateDocument) already use the Any state_root pattern; this PR is a chance to not double down on it for a brand-new doc.

5. Two-phase bind has no compensation

StudioMemberService.BindAsync:

  1. await _scopeBindingCommandPort.UpsertAsync(...) — creates the revision upstream
  2. await _memberCommandPort.RecordBindingAsync(...) — records on member

If step 2 throws, the platform-side service has a fresh revision but the member authority never observes the binding. Next bind will create yet another upstream revision and only that one will be recorded. Acceptable as “last write wins,” but at least add a comment, or compensate by having BindAsync derive last_binding lazily from the platform-side read model on read. Worth thinking about before this is the load-bearing path.

6. Smaller items

  • MemberImplementationKindMapper.ToWireName(StudioMemberLifecycleStage.Unspecified) returns "created". Silently lying about the type — should return empty or throw.
  • ProjectionStudioMemberQueryPort.ListAsync hardcodes Take = DefaultRosterPageSize (200) with no paging. Silently truncates large rosters; either expose paging or document the limit.
  • StudioMemberConventions.NormalizeMemberId doesn’t reject :. The comment in ActorDispatchStudioMemberCommandService.GenerateMemberId claims the format is “free of separators that StudioMemberConventions builds with (':')”, but a user-supplied memberId via CreateStudioMemberRequest.MemberId can contain : — making studio-member:scope:m:nested:bad ambiguous. Either reject : (and any other unsafe chars) in NormalizeMemberId, or stop accepting user-supplied ids and always generate them.
  • ApplyRenamed overwrites Description with evt.Description unconditionally — empty string in the rename event wipes existing description. Likely fine (rename payload always carries the full description), but worth a comment or a typed “did the caller intend to clear?” field. Low priority since rename has no caller yet (see point 3).
  • HandleCreated idempotency: same MemberId returns silently even if displayName / implementationKind / description differ from the existing ones. That’s probably the right call (don’t let a duplicate POST mutate stable identity), but worth surfacing as “first-write wins” — currently it’s implicit.

{
_bootstrap = bootstrap ?? throw new ArgumentNullException(nameof(bootstrap));
_dispatchPort = dispatchPort ?? throw new ArgumentNullException(nameof(dispatchPort));
_queryPort = queryPort ?? throw new ArgumentNullException(nameof(queryPort));
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dead injection. _queryPort is assigned but never read in this class. Either drop the dep + ctor parameter, or actually use it (e.g. for the duplicate-create check before dispatching StudioMemberCreatedEvent).

detail.Summary.DisplayName,
request);

var bindingResult = await _scopeBindingCommandPort.UpsertAsync(bindingRequest, ct);
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bind flow gap vs issue #325.

The issue says bind should publish the current member revision to the member’s service — implying the implementation revision is already known to the member. Here, BindAsync reads the implementation spec out of the request body and forwards it directly to _scopeBindingCommandPort.UpsertAsync without ever calling _memberCommandPort.UpdateImplementationAsync or otherwise touching implementation_ref on the actor.

Net effect: StudioMemberState.ImplementationRef stays empty after bind, the BuildReady lifecycle stage is unreachable (since ApplyBound jumps straight to BindReady), and the read model can never answer “what is this member implemented by.” That contradicts the proto, the projector denormalization plan, and issue #325’s required implementationRef field.

Suggested fix: before delegating to the scope binding port, call _memberCommandPort.UpdateImplementationAsync (or fold the impl-ref persistence into the bind event). Then the member authority is the source of truth for implementation_ref and the lifecycle progression Created → BuildReady → BindReady becomes real.

bindingResult.RevisionId,
implementationKindWire,
ct);

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two-phase write with no compensation. If RecordBindingAsync throws after _scopeBindingCommandPort.UpsertAsync has succeeded, the platform-side has a real revision but the StudioMember actor doesn’t know about it. Next bind will create another upstream revision and only that one is recorded. Acceptable as “last write wins,” but worth either a comment or — preferably — making the read of last_binding lazy off the platform read model so the member doesn’t have to maintain its own copy.

string scopeId,
string memberId,
StudioMemberImplementationRefResponse implementation,
CancellationToken ct = default);
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UpdateImplementationAsync has no caller in this PR. Either wire it into the bind flow (see comment on StudioMemberService.BindAsync), expose it via a PUT /members/{memberId}/implementation endpoint, or drop it. CLAUDE.md: 删除优先:空转发、重复抽象、无业务价值代码直接删除,不保留兼容空壳.

}

await PersistDomainEventAsync(evt);
}
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HandleRenamed (and StudioMemberRenamedEvent) has no producer in this PR — there is no service / endpoint that emits a rename event. The rename-safe-publishedServiceId invariant is unit-tested at the state machine level (good), but the rename surface itself is unreachable end-to-end. Either expose PUT /members/{memberId} to drive it, or remove until the next PR needs it.

StudioMemberLifecycleStage.Created => MemberLifecycleStageNames.Created,
StudioMemberLifecycleStage.BuildReady => MemberLifecycleStageNames.BuildReady,
StudioMemberLifecycleStage.BindReady => MemberLifecycleStageNames.BindReady,
_ => MemberLifecycleStageNames.Created,
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Silent type lie: ToWireName(StudioMemberLifecycleStage.Unspecified) returns "created". If Unspecified ever ends up in the document, callers will see a member in created stage when in fact the projection is malformed. Should either return an empty string or throw. Same caution for the kind mapper if a value drifts.

},
],
Take = DefaultRosterPageSize,
};
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hardcoded Take = DefaultRosterPageSize (200) with no paging. Silently truncates if a scope ever has > 200 members — Studio frontend would just stop seeing some members with no signal. At minimum surface a nextPageToken or document the limit on the contract; preferably accept a paging cursor on ListAsync.


if (document.StateRoot != null && document.StateRoot.Is(StudioMemberState.Descriptor))
{
var state = document.StateRoot.Unpack<StudioMemberState>();
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading internal actor state from the read-model document. Tied to the state_root Any-pack issue on the proto: the query port should not need Unpack<StudioMemberState>(). Once implementation_ref and last_binding are denormalized into the document, this whole branch goes away.

if (string.IsNullOrEmpty(trimmed))
throw new ArgumentException("memberId is required.", nameof(memberId));
return trimmed;
}
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NormalizeMemberId doesn’t reject :. ActorDispatchStudioMemberCommandService.GenerateMemberId says the generated format is “URL-safe and free of separators that StudioMemberConventions builds with (':')”, but CreateStudioMemberRequest.MemberId lets a caller supply an arbitrary id, which goes through this normalize and back into BuildActorId. A memberId of m:nested produces actor id studio-member:scope-1:m:nested — ambiguous if anything ever parses it, and bypasses the immutable-id intent. Either reject : (and other unsafe chars) here, or stop accepting user-supplied ids and always generate them in the command service.

$"member already initialized with id '{State.MemberId}'.");
}

return;
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Idempotent re-create: same member_id returns silently even if display_name / implementation_kind / description in the new event differ from the persisted ones. Probably the right call (don’t let a duplicate POST mutate stable identity), but it’s a first-write-wins policy and right now it’s implicit. Worth either a comment, or making the mismatch on the non-identity fields throw the same way member_id mismatch does — otherwise the silent ignore can mask a real bug client-side.

@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 27, 2026

Codecov Report

❌ Patch coverage is 83.72740% with 117 lines in your changes missing coverage. Please review.
✅ Project coverage is 70.41%. Comparing base (fb393b7) to head (9cc0ae6).
⚠️ Report is 8 commits behind head on dev.

Files with missing lines Patch % Lines
...Application/Studio/Services/StudioMemberService.cs 72.35% 29 Missing and 18 partials ⚠️
....Studio.Hosting/Endpoints/StudioMemberEndpoints.cs 78.04% 13 Missing and 5 partials ⚠️
...tion/QueryPorts/ProjectionStudioMemberQueryPort.cs 84.95% 5 Missing and 12 partials ⚠️
...on/Projectors/StudioMemberCurrentStateProjector.cs 82.35% 4 Missing and 8 partials ⚠️
...io.Application/Studio/Contracts/MemberContracts.cs 87.14% 9 Missing ⚠️
...ervices/ActorDispatchStudioMemberCommandService.cs 93.80% 2 Missing and 5 partials ⚠️
...dio/Services/StudioMemberCreateRequestValidator.cs 87.17% 3 Missing and 2 partials ⚠️
...ojection/Mapping/MemberImplementationKindMapper.cs 93.10% 1 Missing and 1 partial ⚠️
@@            Coverage Diff             @@
##              dev     #428      +/-   ##
==========================================
- Coverage   70.62%   70.41%   -0.21%     
==========================================
  Files        1179     1190      +11     
  Lines       85165    85882     +717     
  Branches    11172    11258      +86     
==========================================
+ Hits        60145    60473     +328     
- Misses      20679    21028     +349     
- Partials     4341     4381      +40     
Flag Coverage Δ
ci 70.41% <83.72%> (-0.21%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
...udio/Abstractions/StudioMemberNotFoundException.cs 100.00% <100.00%> (ø)
...DependencyInjection/ServiceCollectionExtensions.cs 100.00% <100.00%> (ø)
...vatar.Studio.Hosting/StudioCapabilityExtensions.cs 76.00% <100.00%> (-24.00%) ⬇️
...oProjectionReadModelServiceCollectionExtensions.cs 91.53% <100.00%> (+0.26%) ⬆️
...DependencyInjection/ServiceCollectionExtensions.cs 100.00% <100.00%> (ø)
...tudioMemberCurrentStateDocumentMetadataProvider.cs 100.00% <100.00%> (ø)
...Models/StudioMemberCurrentStateDocument.Partial.cs 100.00% <100.00%> (ø)
...ojection/Mapping/MemberImplementationKindMapper.cs 93.10% <93.10%> (ø)
...dio/Services/StudioMemberCreateRequestValidator.cs 87.17% <87.17%> (ø)
...ervices/ActorDispatchStudioMemberCommandService.cs 93.80% <93.80%> (ø)
... and 5 more

... and 16 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

…dpoints

Raises patch coverage on PR #428 by adding focused unit tests for the
previously-uncovered StudioMember files. Also tightens the actor-id
boundary so a malicious scope or member id cannot forge an actor id by
embedding the ':' separator.

Tests added (+40):
- ProjectionStudioMemberQueryPortTests — list/get scope-pinning, typed
  implementation_ref unpacking for workflow / script / gagent.
- ActorDispatchStudioMemberCommandServiceTests — create dispatches the
  canonical actor id and the rename-safe publishedServiceId, all three
  implementation kinds build the typed implementation_ref, RecordBinding
  rejects degenerate publishedServiceId / revisionId.
- StudioMemberCurrentStateProjectorTests — committed events upsert the
  denormalized roster fields; non-committed payloads no-op; null-arg guards.
- StudioMemberEndpointsTests — handler-level: scope-access guard
  short-circuits before service calls, GET/notfound mapping, BindAsync
  domain errors map to 400.
- StudioMemberReadModelMetadataTests — index name + read-model surface.
- Tighter conventions: NormalizeScopeId/NormalizeMemberId now reject the
  ':' actor-id separator so a caller cannot forge an actor id by colliding
  with another scope/member.
- ToWireName(StudioMemberLifecycleStage.Unspecified) returns empty string
  instead of silently mapping to "created" (lying to callers about state).

Wiring:
- Aevatar.Studio.Hosting / Aevatar.Studio.Projection now expose
  InternalsVisibleTo("Aevatar.Studio.Tests") so internal endpoints + the
  internal command service are testable without leaking the surface
  publicly.
- Endpoint handlers in StudioMemberEndpoints are internal static (was
  private static) for the same reason — they remain unreachable from
  outside the project except in tests.

Validation:
- dotnet build aevatar.slnx: 0 errors.
- dotnet test test/Aevatar.Studio.Tests: 275 passed (was 235 → +40).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor Author

@eanzhao eanzhao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow-up review — additional issues

Further pass after the first review surfaced more concerns. Inline comments below; brief grouping here:

A. New layering coupling: Aevatar.Studio.Application now project-refs Aevatar.GAgents.StudioMember

This is the first agent package that Aevatar.Studio.Application takes a <ProjectReference> on. Compare to UserConfig / RoleCatalog / ConnectorCatalog — those agent packages are referenced only by Aevatar.Studio.Projection, not by Application. The Application layer there receives Application-shaped DTOs and the Projection layer maps them onto the proto event types.

In this PR, Aevatar.Studio.Application/Studio/Contracts/MemberImplementationKindMapper.cs imports Aevatar.GAgents.StudioMember and uses StudioMemberImplementationKind / StudioMemberLifecycleStage (proto-generated enums) directly. CLAUDE.md 上层依赖抽象,禁止…对具体实现的直接耦合 — Application should be agnostic of the proto representation; that representation is an Agent / boundary concern.

Suggested shape:

  • Define MemberImplementationKind / MemberLifecycleStage as Application-owned enums (or keep them as the existing wire-string constants).
  • Move the proto-enum mapper to the Projection layer (next to ActorDispatchStudioMemberCommandService and ProjectionStudioMemberQueryPort).
  • Drop the agent ProjectReference from Aevatar.Studio.Application.csproj.

This is a structural item, not a one-line fix — but worth resolving now while the surface is small. Once anything else lands on top of this, ripping it out gets harder.

B. HTTP status semantics: "member not found" surfaces as 400, not 404

StudioMemberService.BindAsync throws InvalidOperationException when the member doesn’t exist, and StudioMemberEndpoints.HandleBindAsync catches InvalidOperationException → 400 BadRequest with INVALID_STUDIO_MEMBER_BINDING. This conflates missing resource with bad input. Studio frontend can’t reliably distinguish “user typo” from “real 4xx”. Same shape for GetBindingAsync and (less obviously) the validation throws. Consider:

  • A typed MemberNotFoundException (or KeyNotFoundException) mapped to 404 in the endpoint.
  • Reserve 400 / INVALID_STUDIO_MEMBER_BINDING for actual bad-input cases (missing yaml, wrong kind etc.).

C. Defensive-actor: evt.PublishedServiceId is trusted blind

StudioMemberGAgent.HandleCreated persists whatever evt.PublishedServiceId came in. The dispatcher in ActorDispatchStudioMemberCommandService.CreateAsync derives it via StudioMemberConventions.BuildPublishedServiceId(memberId), but the actor itself doesn’t enforce that invariant. A future caller dispatching the event directly (or replaying an old event with a stale derivation rule) could persist a published_service_id that no longer matches BuildPublishedServiceId(memberId). Cheap fix: in ApplyCreated, rebuild from memberId and discard evt.PublishedServiceId, so the convention stays a single source of truth.

D. Unchecked enum casts in the query port

ProjectionStudioMemberQueryPort.ToSummary casts the int columns straight to enum:

var implementationKind = (StudioMemberImplementationKind)document.ImplementationKind;
var lifecycleStage = (StudioMemberLifecycleStage)document.LifecycleStage;

If the document was projected by a future build that knows new enum values, or by a corrupt write, the cast silently produces an out-of-range enum and downstream ToWireName returns "" (which is at least no longer the silent "created" lie after fix #2). Worth either an Enum.IsDefined guard with a logged warning, or storing the wire string directly in the document so the projection contract is the single source of truth.

E. StateRoot.Is(StudioMemberState.Descriptor) failure is silent

When state_root is null or carries a different descriptor, ToDetail silently drops implementation_ref and last_binding from the response — without distinguishing “doesn’t exist yet” from “projection is malformed.” Combined with the still-open state-mirror dump concern from review #1, this is the second-order cost of leaning on Any state_root for derived fields. Until the document is fully denormalized, at least surface a structured warning when the descriptor mismatches; a silent partial response makes downstream debugging painful.

F. Unused using in StudioMemberService.cs

Aevatar.GAgents.StudioMember is imported but no symbol from it is referenced in the file. Drop it (and once point A is addressed it stops being importable from this layer at all).

G. No length / character bounds on displayName, description, user-supplied memberId

CreateStudioMemberRequest accepts arbitrary content. A 10MB displayName persists end-to-end (actor state → projection → ES doc → API response). Doesn’t take much to be a soft attack surface or a noisy-neighbor bug. Add minimal length caps + character whitelist on the create path. (Memberid : rejection is now in place after fix #3 — extending it to other unsafe chars and a max length is the next step.)

<ProjectReference Include="..\Aevatar.Scripting.Infrastructure\Aevatar.Scripting.Infrastructure.csproj" />
<ProjectReference Include="..\workflow\Aevatar.Workflow.Application.Abstractions\Aevatar.Workflow.Application.Abstractions.csproj" />
<ProjectReference Include="..\workflow\Aevatar.Workflow.Core\Aevatar.Workflow.Core.csproj" />
<ProjectReference Include="..\..\agents\Aevatar.GAgents.StudioMember\Aevatar.GAgents.StudioMember.csproj" />
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New layering coupling: this is the first time Aevatar.Studio.Application takes a <ProjectReference> on an agent package. UserConfig, RoleCatalog, ConnectorCatalog all live behind Aevatar.Studio.Projection. Pulling Aevatar.GAgents.StudioMember into Application means the Application contract now depends on the proto-generated enums (StudioMemberImplementationKind, StudioMemberLifecycleStage) — see MemberImplementationKindMapper.cs.

CLAUDE.md 严格分层 / 上层依赖抽象,禁止…对具体实现的直接耦合. Suggest: move the proto-enum mapper to the Projection layer (next to ActorDispatchStudioMemberCommandService), have Application stay on the wire-string constants in MemberImplementationKindNames, and drop this ProjectReference.

@@ -0,0 +1,45 @@
using Aevatar.GAgents.StudioMember;
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tied to the layering comment on the csproj — this mapper is the only consumer of Aevatar.GAgents.StudioMember.* in Application. Easiest fix: move the file to Aevatar.Studio.Projection/Adapters/, leave the wire-string constants (MemberImplementationKindNames) in Application, and have the Projection-side ActorDispatchStudioMemberCommandService and ProjectionStudioMemberQueryPort use the moved mapper.

@@ -0,0 +1,216 @@
using Aevatar.GAgentService.Abstractions;
using Aevatar.GAgentService.Abstractions.Ports;
using Aevatar.GAgents.StudioMember;
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused using — no symbol from Aevatar.GAgents.StudioMember is referenced in this file. Remove.

ArgumentNullException.ThrowIfNull(request);

var detail = await _memberQueryPort.GetAsync(scopeId, memberId, ct)
?? throw new InvalidOperationException(
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wrong HTTP semantics. member not found is a 404, not a 400. The endpoint catches InvalidOperationException → 400 with code INVALID_STUDIO_MEMBER_BINDING, which conflates missing resource with bad input. Throw a typed MemberNotFoundException (or KeyNotFoundException) and map it to 404 in StudioMemberEndpoints.HandleBindAsync / HandleGetBindingAsync.

Description = evt.Description,
ImplementationKind = evt.ImplementationKind,
ImplementationRef = null,
PublishedServiceId = evt.PublishedServiceId,
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Defensive-actor invariant: evt.PublishedServiceId is trusted blind here. The dispatcher derives it via StudioMemberConventions.BuildPublishedServiceId(memberId), but the actor doesn’t re-derive or assert. A future caller (or a replayed historical event from a different derivation rule) could persist a published_service_id that no longer matches the convention.

Cheap fix: rebuild it inside ApplyCreated and ignore evt.PublishedServiceId, so the convention stays single-source. The event field becomes documentation rather than load-bearing.

private static StudioMemberSummaryResponse ToSummary(StudioMemberCurrentStateDocument document)
{
var implementationKind = (StudioMemberImplementationKind)document.ImplementationKind;
var lifecycleStage = (StudioMemberLifecycleStage)document.LifecycleStage;
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unchecked int → enum casts. If a stale projection wrote a value outside the enum (corrupt index, future-version writer, etc.), the cast silently produces an out-of-range enum and ToWireName returns "". Either guard with Enum.IsDefined and log, or — cleaner — store the wire string in the document and drop the int columns entirely.

StudioMemberImplementationRefResponse? implementationRef = null;
StudioMemberBindingContractResponse? lastBinding = null;

if (document.StateRoot != null && document.StateRoot.Is(StudioMemberState.Descriptor))
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Silent fall-through: if StateRoot is null or carries a non-StudioMemberState descriptor, the response loses implementation_ref + last_binding with no signal.

Combined with the still-open state-mirror-dump concern from review #1, this is the failure mode of leaning on Any state_root for derived fields. Until denormalization happens, at least log a structured warning when the descriptor mismatches — a silent partial response makes downstream debugging painful.

{
code = "INVALID_STUDIO_MEMBER_BINDING",
message = ex.Message,
});
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Catch-all for InvalidOperationException → 400. Combined with the “not found” throw on the service side, this returns 400 INVALID_STUDIO_MEMBER_BINDING for both missing member and bad bind payload. Studio frontend can’t reliably distinguish them. Differentiate by exception type and map missing-member to 404.

string DisplayName,
string ImplementationKind,
string? Description = null,
string? MemberId = null);
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No bounds on DisplayName, Description, or MemberId. A 10MB displayName will persist all the way through (actor state → projection → ES doc → API response). Add length caps + character validation at the create path. With the : rejection in NormalizeMemberId now in place, extend to a stricter slug whitelist for MemberId (e.g. ^[A-Za-z0-9_-]{1,64}$) and a sane max length on DisplayName / Description.

eanzhao and others added 3 commits April 27, 2026 13:08
… guard

The fast-gates guard scans path:line:content for "(reader|document|projection)"
preceding ".ListAsync". The test file path
"ProjectionStudioMemberQueryPortTests.cs" matched the "projection" branch
on the path even though the actual call was port.ListAsync (the test
target's own method, not a document reader).

Renaming the test file drops the false positive without changing semantics
or coverage. The production class still lives under
src/Aevatar.Studio.Projection/QueryPorts/.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Issue #325 review touched eight separate concerns; this commit groups the
fixes that should ship in this PR.

Bind orchestration now traverses Created → BuildReady → BindReady
- StudioMemberService.BindAsync calls UpdateImplementationAsync with the
  resolved impl ref derived from the binding result, then RecordBindingAsync.
  The actor now sees impl_updated → bound, so implementation_ref is no
  longer always empty and BuildReady is reachable.
- ApplyImplementationUpdated downgrades BindReady → BuildReady when an
  out-of-band impl update happens, so a bound member doesn't keep claiming
  readiness against a stale revision. Inside a single BindAsync the
  Bound event re-promotes to BindReady, so the canonical happy path
  remains a single observable lifecycle.
- Two-phase write is documented as last-write-wins in the BindAsync xmldoc.

Hardened authority + state machine
- ApplyCreated re-derives publishedServiceId from
  StudioMemberConventions.BuildPublishedServiceId(memberId) instead of
  trusting evt.PublishedServiceId. The convention stays single-source
  even if a stale or hand-rolled event drifts.
- HandleCreated rejects re-create with same memberId but mismatched
  displayName / description / implementationKind. First-write-wins on
  identity is now explicit; mismatched non-identity fields no longer
  silently no-op.

Read-model denormalized — no Any-pack of internal state
- StudioMemberCurrentStateDocument drops state_root and gains typed
  denormalized fields: implementation_workflow_id /
  implementation_workflow_revision / implementation_script_id /
  implementation_script_revision / implementation_actor_type_name /
  last_bound_published_service_id / last_bound_revision_id /
  last_bound_implementation_kind / last_bound_at.
- implementation_kind / lifecycle_stage are now wire-stable strings
  (was: int enum codes) so a stale projection cannot round-trip an
  out-of-range value silently.
- ProjectionStudioMemberQueryPort no longer unpacks Any state_root;
  reads come straight from the denormalized fields.

HTTP semantics
- New StudioMemberNotFoundException; endpoints map it to 404
  STUDIO_MEMBER_NOT_FOUND. Validation errors stay 400. The bind/get
  endpoints used to conflate missing-member with bad-input.
- ListAsync accepts ?pageSize / ?pageToken; response carries
  NextPageToken. Default + max page size = 200, both bounded.

Layering
- MemberImplementationKindMapper moved from
  Aevatar.Studio.Application/Studio/Contracts to
  Aevatar.Studio.Projection/Mapping. Application no longer takes a
  ProjectReference on the StudioMember agent package, restoring the
  Domain → Application → Projection layering convention. Wire-string
  constants (MemberImplementationKindNames / MemberLifecycleStageNames)
  stay in Application — they don't depend on proto types.

Input safety
- StudioMemberInputLimits centralizes length caps (DisplayName ≤ 256,
  Description ≤ 2048, MemberId ≤ 64) plus a slug regex on caller-supplied
  member ids ([A-Za-z0-9][A-Za-z0-9_-]{0,63}). Stops 10MB displayNames
  from persisting all the way through to the index.

Cleanup
- Dropped unused `using Aevatar.GAgents.StudioMember;` from
  StudioMemberService — it never referenced any symbol from there.
- Dropped the dead `_queryPort` injection from
  ActorDispatchStudioMemberCommandService (already done upstream).

Tests
- Updated StudioMemberCurrentStateProjectorTests for the denormalized
  schema; added a script-impl coverage case.
- Updated StudioMemberQueryPortTests for the no-state_root document.
- Updated StudioMemberServiceBindingTests for the
  UpdateImplementationAsync recording path.
- New StudioMemberPRReviewFixesTests covers: ApplyCreated re-derives
  publishedServiceId; ApplyImplementationUpdated downgrades BindReady;
  ApplyImplementationUpdated upgrades Created; input limits surface;
  endpoints map StudioMemberNotFoundException to 404; roster carries
  NextPageToken.

Validation:
- dotnet build aevatar.slnx: 0 errors.
- dotnet test test/Aevatar.Studio.Tests: 287 passed (was 275 → +12).
- Targeted CI guards (projection_state_version, projection_state_mirror,
  projection_route_mapping, query_projection_priming, test_stability,
  proto_lint, cqrs_eventsourcing_boundary): all pass.
- ListAsync false-positive check on architecture_guards: empty.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor Author

@eanzhao eanzhao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow-up review on the fix commit

Looked through 03a51d7a end-to-end. The fix is solid:

  • Lifecycle gap closed (BindAsync now does UpdateImplementation → RecordBinding, projector / actor reach BuildReady and BindReady legitimately).
  • state_root Any is gone, document is fully denormalized; query port no longer peeks inside actor state.
  • Convention single-source: ApplyCreated re-derives publishedServiceId from BuildPublishedServiceId(memberId).
  • Layering restored: Aevatar.Studio.Application.csproj no longer ProjectRefs the agent package; MemberImplementationKindMapper lives in Aevatar.Studio.Projection.Mapping.
  • Input bounds + slug pattern + 404 semantics all in place; pagination wired through pageSize / pageToken.
  • HandleCreated now rejects mismatched non-identity fields (first-write-wins on identity).
  • 287 Studio tests pass.

A handful of residual items below — none of them are blockers, but worth landing or at least filing.

1. StudioMemberNotFoundException : InvalidOperationException is brittle

The new exception subclasses InvalidOperationException. That works only because every endpoint puts the StudioMemberNotFoundException catch ahead of the InvalidOperationException catch:

catch (StudioMemberNotFoundException ex) { return NotFound(ex); }
catch (InvalidOperationException ex) { return BadRequest(...); }

Reorder those by accident in a future refactor and the 404 silently downgrades to a 400 with INVALID_STUDIO_MEMBER_BINDING. Same brittleness shows up in tests: BindAsync_ShouldFail_WhenMemberDoesNotExist asserts ThrowAsync<InvalidOperationException>() (line 134), so it passes regardless of whether the throw is the typed StudioMemberNotFoundException or any random IOEx (e.g. the 'has no publishedServiceId' invariant violation). The test wouldn’t catch a regression that swapped the typed exception for a plain throw.

Suggest one of:

  • Subclass Exception (or KeyNotFoundException) instead of InvalidOperationException, and update the test to ThrowAsync<StudioMemberNotFoundException>().
  • Or keep the inheritance but tighten the test.

2. The bind-lifecycle wiring is uncovered by StudioMemberServiceBindingTests

This is the main fix on the PR — BindAsync calls UpdateImplementationAsync between the scope upsert and the RecordBinding so the actor walks Created → BuildReady → BindReady. The recording stub captures it (RecordedImplementationUpdates, line 232) but no test asserts on it. The state-machine half of the fix is locked in by StudioMemberPRReviewFixesTests, but the orchestration half can be silently undone — drop the UpdateImplementationAsync call from BindAsync and every existing test in StudioMemberServiceBindingTests still passes.

Fix: in each of BindAsync_Workflow… / …_Script… / …_GAgent…, assert commandPort.RecordedImplementationUpdates contains exactly one entry with the right kind + ref shape, and assert it’s recorded before RecordedBindings (so the lifecycle order is locked in too).

3. ApplyImplementationUpdated lets ImplementationKind drift post-create

next.ImplementationKind = evt.ImplementationKind;
next.ImplementationRef = evt.ImplementationRef?.Clone();

HandleCreated now rejects mismatched-kind re-creates (good), but HandleImplementationUpdated happily overwrites State.ImplementationKind with whatever the event says. So a script-kind member can be silently mutated into a workflow-kind member by dispatching StudioMemberImplementationUpdatedEvent { ImplementationKind = Workflow, … }. The bind orchestration today never does this, but the actor protocol allows it.

Either:

  • Drop ImplementationKind from StudioMemberImplementationUpdatedEvent (use State.ImplementationKind), or
  • Have HandleImplementationUpdated reject when evt.ImplementationKind != State.ImplementationKind.

The second is safer if there’s any chance the event field has external consumers; the first is cleaner.

4. HandleGetBindingAsync returns 404 with two different shapes

var binding = await memberService.GetBindingAsync(scopeId, memberId, ct);
return binding == null ? Results.NotFound() : Results.Ok(binding);

Now that GetBindingAsync throws StudioMemberNotFoundException for missing members, the bare Results.NotFound() only fires for member exists, never bound. So callers see two 404 shapes: { code: STUDIO_MEMBER_NOT_FOUND, … } for unknown member vs. empty 404 body for unbound member. Frontend can’t distinguish them from status code alone.

“Member exists but never bound” isn’t a missing resource; consider returning 200 { lastBinding: null } (the request is satisfied) or a typed STUDIO_MEMBER_NOT_BOUND_YET 404 body. Either way, distinguishable.

5. StudioMemberService.CreateAsync does no validation; it pass-throughs

Length caps + slug pattern live inside ActorDispatchStudioMemberCommandService.ValidateUserSuppliedMemberId (Projection layer). If anyone swaps the command port — for tests, for an in-memory variant, for a future migration — those bounds disappear silently. Application is the natural home for input shape validation; today it’s only wiring.

Minimal fix: lift ValidateUserSuppliedMemberId, the displayName cap, and the description cap to StudioMemberInputLimits (already in Application) and have StudioMemberService.CreateAsync call them before delegating. The command port can then trust the input it receives.

6. HandleCreateAsync has a dead catch (StudioMemberNotFoundException)

CreateAsync cannot legitimately throw StudioMemberNotFoundException — there’s no member to be missing yet. Delete the catch in HandleCreateAsync (line 55–58). Same comment isn’t a problem on the bind / get-binding handlers, where the catch is load-bearing.

7. Naming: StudioMemberRosterPageRequest.Cursor vs HTTP pageToken

Minor — the public HTTP contract is pageToken (and the response field is NextPageToken), the internal request DTO is Cursor. Either the contract field renames to Cursor or the DTO renames to PageToken. Pick one to keep search-grepability sane.

/// corresponding member document in the requested scope. Endpoints map this
/// to HTTP 404 — distinct from validation errors that map to 400.
/// </summary>
public sealed class StudioMemberNotFoundException : InvalidOperationException
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Subclassing InvalidOperationException makes the 404 / 400 distinction depend on catch ordering in every endpoint. A reorder turns 404 into 400 silently. Plus, BindAsync_ShouldFail_WhenMemberDoesNotExist asserts ThrowAsync<InvalidOperationException>() and so passes regardless of whether the typed StudioMemberNotFoundException is preserved.

Suggest base on Exception (or KeyNotFoundException) and update the binding test to assert on StudioMemberNotFoundException directly.

response.PublishedServiceId.Should().Be(PublishedServiceId);
response.RevisionId.Should().Be(bindingPort.IssuedRevisionId);
response.ImplementationKind.Should().Be(MemberImplementationKindNames.Workflow);
}
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Coverage gap: the new bind-lifecycle wiring (UpdateImplementationAsyncRecordBindingAsync) is the headline fix, but it’s not asserted here. Dropping the UpdateImplementationAsync call from BindAsync would still leave every test in this file green.

RecordedImplementationUpdates is captured on the stub (line 232) — assert on it: exactly one entry, the right kind + ref shape, and recorded before the RecordedBindings entry so the BuildReady → BindReady order is locked in. Apply the same to the script / gagent tests.

StudioMemberState state, StudioMemberImplementationUpdatedEvent evt)
{
var next = state.Clone();
next.ImplementationKind = evt.ImplementationKind;
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ImplementationKind drift post-create.

HandleCreated now rejects re-create with a different kind (good), but ApplyImplementationUpdated overwrites State.ImplementationKind with evt.ImplementationKind unconditionally — so the kind invariant the create event was supposed to lock can still be silently violated by dispatching an StudioMemberImplementationUpdatedEvent { ImplementationKind = …different… }.

Either drop ImplementationKind from the implementation-updated event (use State.ImplementationKind), or have HandleImplementationUpdated reject when evt.ImplementationKind != State.ImplementationKind. The bind orchestration doesn’t mutate kind today, but the actor protocol allows it.

try
{
var binding = await memberService.GetBindingAsync(scopeId, memberId, ct);
return binding == null ? Results.NotFound() : Results.Ok(binding);
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two different 404 shapes for the same endpoint. After the StudioMemberNotFoundException change, the binding == null branch only fires when the member exists but has never been bound. So callers see:

  • 404 { code: STUDIO_MEMBER_NOT_FOUND, … } for unknown member
  • bare 404 (empty body) for member-exists-no-binding

Frontend can’t distinguish them from status code alone, and “exists but never bound” is arguably not a missing-resource case at all. Suggest 200 with lastBinding: null, or a typed STUDIO_MEMBER_NOT_BOUND_YET body — either way, distinguishable.

var summary = await memberService.CreateAsync(scopeId, request, ct);
return Results.Created($"/api/scopes/{scopeId}/members/{summary.MemberId}", summary);
}
catch (StudioMemberNotFoundException ex)
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dead catch — CreateAsync cannot legitimately throw StudioMemberNotFoundException (there is no member to be missing yet). Remove. The corresponding catches on the bind / get-binding handlers are load-bearing and should stay.

CancellationToken ct = default)
{
ArgumentNullException.ThrowIfNull(request);
return _memberCommandPort.CreateAsync(scopeId, request, ct);
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Validation lives in the wrong layer.

Length caps + slug pattern live in ActorDispatchStudioMemberCommandService.ValidateUserSuppliedMemberId and inline in the Projection-layer command service. If anyone swaps the command port — alternate impl, in-memory test variant, migration tool — the bounds disappear silently.

StudioMemberInputLimits is already in Application (MemberContracts.cs). Lift the actual displayName.Length > MaxDisplayNameLength / description.Length > MaxDescriptionLength / slug-regex checks here in CreateAsync, and let the command port trust its input.


public sealed record StudioMemberRosterPageRequest(
int? PageSize = null,
string? Cursor = null);
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming inconsistency: HTTP query param is pageToken, response field is NextPageToken, internal DTO field is Cursor. Pick one — PageToken for the DTO is probably the lower-impact rename.

Follow-on review hardening:

ImplementationKind locked at create
- HandleImplementationUpdated rejects events whose ImplementationKind
  disagrees with the persisted kind (Unspecified is the defensive default
  meaning "carry existing"). A Script member can no longer be silently
  mutated into a Workflow member by dispatching an UpdatedEvent with a
  different kind.
- ApplyImplementationUpdated explicitly does NOT touch State.ImplementationKind
  even on hand-rolled / replayed events whose payload disagrees, so the
  invariant holds at the apply step too.

StudioMemberNotFoundException : KeyNotFoundException
- Was InvalidOperationException, which made the endpoint catch order
  load-bearing — reordering catch blocks could silently downgrade a 404
  to a 400. KeyNotFoundException is type-disjoint from
  InvalidOperationException, so HandleCreate / HandleList / HandleGet can
  drop the StudioMemberNotFoundException catch (they don't throw it
  anyway), and HandleBind / HandleGetBinding still catch it explicitly
  and map to 404.

Renamed StudioMemberRosterPageRequest.Cursor → PageToken
- Kept the wire field consistent with HTTP query parameter ?pageToken
  and StudioMemberRosterResponse.NextPageToken so the frontend round-trips
  the same name everywhere.

Tests
- Added ApplyImplementationUpdated_ShouldNotMutateImplementationKind: an
  adversarial event tries to switch Script → Workflow; the apply step
  preserves Script.
- Strengthened all three BindAsync_* tests: assert OperationsInOrder ==
  ["UpdateImplementation", "RecordBinding"] so a regression that swaps
  the order or drops the impl update is caught.
- BindAsync_ShouldFail_WhenMemberDoesNotExist now asserts the typed
  StudioMemberNotFoundException so a regression that swaps it for a plain
  InvalidOperationException is caught (endpoints map the typed one to
  404, untyped IOEx to 400).

Validation
- dotnet build aevatar.slnx: 0 errors.
- dotnet test test/Aevatar.Studio.Tests: 288 passed (was 287 → +1).
- test_stability_guards: pass.
- ListAsync false-positive scan: empty.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor Author

@eanzhao eanzhao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Latest head fixes several earlier items. I left two additional comments on actor-level invariants that are still enforceable through direct command/event paths.

// a Script member can't be silently mutated into a Workflow member by
// dispatching an UpdatedEvent with a different kind. Unspecified is
// accepted as "carry the existing kind" (defensive default).
if (evt.ImplementationKind != StudioMemberImplementationKind.Unspecified
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ImplementationKind is now locked, but the payload shape is still unchecked. Because StudioMemberImplementationRef is not a oneof, a caller can send ImplementationKind = Unspecified (or even the matching kind) with the wrong branch populated, e.g. a Script member carrying Workflow. ApplyImplementationUpdated will clone that ref and the projector can later publish implementation_kind = "script" with workflow fields set. Please validate that exactly the branch matching the existing State.ImplementationKind is populated before persisting the event.

throw new InvalidOperationException("member not yet created.");
}

await PersistDomainEventAsync(evt);
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

recordBinding can still be dispatched directly after create and will make the actor BindReady with no resolved ImplementationRef; StudioMemberGAgentStateTests.Bound_ShouldCaptureLastBindingAndAdvanceLifecycle currently locks in that path. The service now sends UpdateImplementationAsync before RecordBindingAsync, but the actor authority should enforce the invariant too: reject binding unless a matching implementation ref is already present, and also verify evt.PublishedServiceId == State.PublishedServiceId and the event kind matches the locked member kind. Otherwise a malformed command can publish a read model that says the member is bind-ready while it has no implementation, or records another service id as this member's binding.

Copy link
Copy Markdown
Contributor Author

@eanzhao eanzhao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow-up review on c934ec97

Verified: kind-lock + apply-immutability, StudioMemberNotFoundException : KeyNotFoundException, dead catch in HandleCreateAsync removed, Cursor → PageToken, OperationsInOrder ordering assertions on all three bind tests, typed BindAsync_ShouldFail_WhenMemberDoesNotExist assertion, plus the new ApplyImplementationUpdated_ShouldNotMutateImplementationKind. 288 Studio tests pass.

Two items from review #3 are still not addressed, and the same shape mismatch shows up in a third endpoint that wasn’t flagged before. Inline below.

1. (still open) HandleGetBindingAsync returns two different 404 shapes

var binding = await memberService.GetBindingAsync(scopeId, memberId, ct);
return binding == null ? Results.NotFound() : Results.Ok(binding);   // ← bare 404
...
catch (StudioMemberNotFoundException ex) { return NotFound(ex); }    // ← typed 404

After the StudioMemberNotFoundException change, the binding == null branch only fires when the member exists but has never been bound. So callers see:

  • 404 { code: STUDIO_MEMBER_NOT_FOUND, scopeId, memberId, message } for unknown member
  • bare 404 (empty body) for member-exists-no-binding

Frontend can’t distinguish them from status code alone, and "exists but never bound" arguably isn’t a missing-resource case at all (the GET succeeded, the member is there, it just doesn’t have a binding yet). Suggestions, in order of preference:

  1. 200 OK { lastBinding: null } — request was satisfied, no binding to return.
  2. 204 No Content.
  3. Keep 404 but emit a typed STUDIO_MEMBER_NOT_BOUND_YET body so frontends can branch on the code field.

2. (still open) Input validation lives in the wrong layer

StudioMemberInputLimits.MaxDisplayNameLength / MaxDescriptionLength / MaxMemberIdLength / MemberIdPattern are defined in Aevatar.Studio.Application/Studio/Contracts/MemberContracts.cs, but the only enforcement lives in Aevatar.Studio.Projection/CommandServices/ActorDispatchStudioMemberCommandService.cs (lines 50, 57, 210, 215). StudioMemberService.CreateAsync (Application) is a single-line pass-through:

public Task<StudioMemberSummaryResponse> CreateAsync(
    string scopeId, CreateStudioMemberRequest request, CancellationToken ct = default)
{
    ArgumentNullException.ThrowIfNull(request);
    return _memberCommandPort.CreateAsync(scopeId, request, ct);
}

If anyone swaps the command port — alternate impl, in-memory variant, migration tool — every input bound disappears with no signal. Move the length / slug checks into StudioMemberService.CreateAsync (the constants are already in Application, that’s a no-import lift), and let the command port trust its input. Belongs to the layer that owns the Application contract.

3. (new) HandleGetAsync returns a bare 404 for missing member, inconsistent with bind / get-binding

var detail = await memberService.GetAsync(scopeId, memberId, ct);
return detail == null ? Results.NotFound() : Results.Ok(detail);

HandleBindAsync and HandleGetBindingAsync return the typed STUDIO_MEMBER_NOT_FOUND body for missing members; HandleGetAsync returns a bare 404 for the same condition. Callers get different shapes for the same logical "member not found."

Easiest fix: have IStudioMemberService.GetAsync throw StudioMemberNotFoundException instead of returning null (matches GetBindingAsync already), let the endpoint catch it and call NotFound(ex) for a typed body. The detail == null branch then becomes unreachable and can be removed.

This also frees up HandleGetBindingAsync to use 200/204 for the never-bound case (item 1) without the comparison being lopsided across endpoints.

try
{
var binding = await memberService.GetBindingAsync(scopeId, memberId, ct);
return binding == null ? Results.NotFound() : Results.Ok(binding);
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still open from review #3.

Two different 404 shapes for one endpoint:

  • binding == null → bare 404, fires only when member exists but was never bound
  • catch (StudioMemberNotFoundException) → typed 404 { code: STUDIO_MEMBER_NOT_FOUND, ... }, fires when member is missing

Frontend can’t distinguish them from status code alone, and the "exists but never bound" case isn’t really a missing resource. Suggest 200 with lastBinding: null (preferred), 204, or a typed STUDIO_MEMBER_NOT_BOUND_YET 404 body.

CancellationToken ct = default)
{
ArgumentNullException.ThrowIfNull(request);
return _memberCommandPort.CreateAsync(scopeId, request, ct);
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still open from review #3.

StudioMemberInputLimits lives in Application (MemberContracts.cs), but the only callers that enforce the bounds live in Projection (ActorDispatchStudioMemberCommandService.CreateAsync lines 50/57/210/215). StudioMemberService.CreateAsync is a one-line pass-through with no validation. Swap the command port (alt impl / in-memory test / migration tool) and every length cap + slug pattern silently disappears.

Lift the length checks + slug regex up into this method (the constants are already in this layer; it’s a no-import refactor) and let the command port trust the input shape it receives.

try
{
var detail = await memberService.GetAsync(scopeId, memberId, ct);
return detail == null ? Results.NotFound() : Results.Ok(detail);
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New finding — inconsistency from the partial fix to review #3 item 4.

HandleBindAsync and HandleGetBindingAsync both return the typed STUDIO_MEMBER_NOT_FOUND body for missing members; HandleGetAsync returns a bare Results.NotFound() for the same condition. Three endpoints now have three different 404 shapes (this one bare, get-binding has two, bind has one typed). Frontend gets different bodies for the same logical "member not found" depending on which endpoint hit it.

Fix: have IStudioMemberService.GetAsync throw StudioMemberNotFoundException for missing members (matches GetBindingAsync already does this) and add the same typed catch + NotFound(ex) here. The detail == null branch becomes unreachable and can go.

… to Application

Three follow-up review concerns from #428:

1. Two 404 shapes in GET /members/{memberId}/binding
   Before: bare 404 NotFound for "exists but never bound" + typed
   STUDIO_MEMBER_NOT_FOUND for missing member. Frontend couldn't tell
   them apart from status alone.

   After: 200 { lastBinding: null } for "exists but never bound" +
   typed 404 STUDIO_MEMBER_NOT_FOUND for missing member. New wrapper
   StudioMemberBindingViewResponse keeps the response always a JSON
   object so the frontend dispatches on `lastBinding === null` instead
   of overloaded HTTP status.

2. Validation in the wrong layer
   Length caps + slug pattern enforcement was sitting in
   ActorDispatchStudioMemberCommandService (Projection). StudioMemberService
   .CreateAsync was a one-line pass-through. Swapping the command port
   would silently drop the bounds.

   New StudioMemberCreateRequestValidator (Application/Studio/Services/)
   centralizes the bounds; StudioMemberService.CreateAsync calls it
   before dispatching. The Projection-layer command service is now
   intentionally lenient — it trusts already-validated input. Fits
   CLAUDE.md `严格分层 / 上层依赖抽象`.

3. HandleGetAsync 404 shape inconsistent
   Before: bare Results.NotFound() for missing member while bind /
   get-binding returned the typed STUDIO_MEMBER_NOT_FOUND body.
   Three endpoints, three different 404 shapes for the same
   logical condition.

   After: IStudioMemberService.GetAsync throws
   StudioMemberNotFoundException for missing member (matches
   GetBindingAsync). HandleGetAsync now catches the typed exception
   and returns the same 404 body — three endpoints, one 404 shape.

Tests
- New StudioMemberCreateRequestValidatorTests (7): empty / over-cap
  display name, over-cap description, slug-pattern violations
  (whitespace + ':' separator), accepted shapes (omitted memberId,
  valid slug). InternalsVisibleTo("Aevatar.Studio.Tests") added on
  Aevatar.Studio.Application so the internal validator stays internal
  publicly.
- StudioMemberEndpointsTests:
  HandleGetAsync_ShouldReturnTyped404_WhenMemberMissing replaces the
  bare-404 assertion.
  HandleGetBindingAsync_ShouldReturnOk_WithNullBinding_WhenMember-
  ExistsButNeverBound asserts the new 200 + null wrapper shape; the
  existing "binding present" test now asserts the wrapper too.
- ActorDispatchStudioMemberCommandServiceTests: dropped the
  CreateAsync_ShouldRejectEmptyDisplayName test (validation moved to
  Application; the validator-level coverage replaces it). Replaced with
  a comment pointer to the new validator tests.

Validation:
- dotnet build aevatar.slnx: 0 errors.
- dotnet test test/Aevatar.Studio.Tests: 294 passed (was 288 → +6 net).
- targeted guards (test_stability, projection_state_*, proto_lint, ListAsync false-positive): pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@eanzhao eanzhao merged commit 7bef42f into dev Apr 27, 2026
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature] Backend: member-first Studio APIs with stable published service per member

1 participant