enhance(daily): structured-section prompt with omit-empty rule + repo-aware queries (#423 §A) by eanzhao · Pull Request #458 · aevatarAI/aevatar

eanzhao · 2026-04-27T10:14:48Z

Summary

Implements §A (content depth) of #423. The current /daily prompt is a single
paragraph that asks for "3-6 concise bullet points" — produces thin reports and
pads when sources are silent. This PR rewrites TryBuildDailyReportSpec to treat
the system prompt as a fetch-and-summarize specification, not a freeform brief:

Explicit section order with per-section line budgets: Title (1) / Shipped (≤6) /
In flight (≤6) / Reviews (≤4) / Issues (≤4) / CI (≤3) / Trend (1, optional) /
Blockers (1).
"Omit the section entirely if empty" rule (header AND body) — the regression
we're guarding against is the LLM padding silent sources with no activity in this area filler. Empty-day fallback returns only Title + No measurable activity in the last 24h.
Repo-aware query suggestions: when repositories= is provided, the prompt
steers the LLM to per-repo /repos/{owner}/{repo}/... calls and explicitly refuses
the /search/* path (which doesn't filter to a repo allowlist cleanly). When the
allowlist is empty, the suggestion list expands from 3 search queries to 6 (split
PRs / reviews / issues opened / issues closed / issues commented).
Honest empty-source handling: 4xx/5xx/empty → treat as zero, do not retry, do
not invent fallback data.

Two new tests in AgentBuilderToolTests pin the structured-section schema and the
omit-empty rule so future copy edits don't silently regress the spec.

Out of scope (follow-ups for #423)

This PR addresses §A and the corresponding acceptance test. The remaining items
stay open under #423:

§B Option 2 (streaming-edit delivery) — preferred per the issue. Needs a new
SkillRunnerStreamingReplySink plus Lark PATCH /open-apis/im/v1/messages/{id}
integration without a reply token. Larger surface; better as its own PR.
§B Option 1 (batched delivery) — fallback shape; only worth landing if
Option 2 hits a blocker.
§C failure-notification cross-tenant fallback — TrySendFailureAsync going
through the same s/api-lark-bot proxy that just rejected. Tracked in enhance(daily): richer report content + progressive delivery (streaming-edit or batched) #423 §C and
scorecard row C2; requires capturing a channel-bot fallback target at agent-create
time.
Existing agents with frozen old prompt in actor state — SkillRunnerInitialized
freezes SkillContent at create time, so this PR only takes effect for newly
created agents. The structural fix (versioned prompt templates) is tracked as refactor(daily-prompt): versioned daily report prompt templates (prerequisite for #423 enrichment) #450.
Migration path for users today: /agents → Delete → /daily.

The smoke test in the issue's acceptance ("commits authored empty but reviews
non-empty → Reviews renders, Shipped omitted") is a manual run-against-real-GitHub
check; the automated pinning test guards the prompt itself, not the LLM's adherence.

Test plan

dotnet test test/Aevatar.GAgents.ChannelRuntime.Tests/Aevatar.GAgents.ChannelRuntime.Tests.csproj — 475 passed (includes 2 new + 30 existing daily-report tests).
bash tools/ci/architecture_guards.sh — all guards pass.
Manual smoke: create a fresh /daily agent, trigger via /run-agent, verify
the output uses the new section structure and that empty sections are omitted.

Refs #423.

…po-aware queries The single-paragraph "3-6 bullets" daily prompt produced thin reports and padded when sources were silent. Treat the prompt as a fetch-and-summarize spec: explicit section order (Title / Shipped / In flight / Reviews / Issues / CI / Trend / Blockers), hard per-section line budgets, and an "omit the section entirely if empty" rule. Switch the suggested GitHub queries to per-repo endpoints when `repositories=` is provided so the LLM doesn't fold the allowlist back into a global /search/* (which doesn't filter by repo cleanly). Pin the new schema and the empty-section rule with two AgentBuilderToolTests checks so future copy edits don't silently regress the spec. Refs #423 §A. Streaming-edit delivery (§B Option 2), batched delivery (§B Option 1), and the failure-notification cross-tenant fallback (§C) remain follow-ups. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f8113c5dfd

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-27T10:17:07Z

+                .AppendLine("- GET /repos/{owner}/{repo}/pulls?state=closed&per_page=20               // filter merged_at >= {iso_date} client-side for shipped")
+                .AppendLine("- GET /repos/{owner}/{repo}/pulls?state=open&per_page=20                 // filter author = {username} client-side for in flight")


Filter shipped PRs by the target user in repo mode

This per-repo shipped query currently pulls all closed PRs and only asks for a merged_at cutoff, which means active repos will include other contributors’ merges in a user-specific daily report. In repositories= mode this can systematically misattribute shipped work to the configured user. Please also filter by author (for example user.login == {username} client-side, or a query form that includes author:{username}) before summarizing the Shipped section.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-04-27T10:17:07Z

+                .AppendLine("- GET /search/issues?q=author:{username}+is:pr+is:merged+merged:>={iso_date}      // shipped PRs")
+                .AppendLine("- GET /search/issues?q=author:{username}+is:pr+is:open                            // in flight")
+                .AppendLine("- GET /search/issues?q=reviewed-by:{username}+updated:>={iso_date}                // reviews")
+                .AppendLine("- GET /search/issues?q=author:{username}+is:issue+created:>={iso_date}            // issues opened")
+                .AppendLine("- GET /search/issues?q=author:{username}+is:issue+is:closed+closed:>={iso_date}   // issues closed")


Provide a commit data source for the Shipped section

The new schema says Shipped includes “commits to the default branch,” but the suggested global calls now only cover PR/issues searches and omit any commit endpoint. When a user ships direct commits without merged PRs, the report has no instructed source for that activity and will undercount output. Add an explicit commit query/source (and the repo-scoped equivalent) or remove commit claims from the section contract.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-04-27T10:17:07Z

+                .AppendLine("- GET /repos/{owner}/{repo}/pulls?state=closed&per_page=20               // filter merged_at >= {iso_date} client-side for shipped")
+                .AppendLine("- GET /repos/{owner}/{repo}/pulls?state=open&per_page=20                 // filter author = {username} client-side for in flight")
+                .AppendLine("- GET /search/issues?q=reviewed-by:{username}+repo:{owner}/{repo}+updated:>={iso_date}")
+                .AppendLine("- GET /search/issues?q=author:{username}+repo:{owner}/{repo}+is:issue+updated:>={iso_date}")


Include commenter activity in repo-scoped issue guidance

In repo-allowlist mode, the Issues guidance only queries author:{username} updates, but the section definition requires issues “opened, closed, or commented on” by the user. This drops the common case where the user comments on issues they did not author, so repo-scoped reports can miss meaningful issue activity. Add a per-repo commenter:{username} query (and keep opened/closed splits as needed) to align with the stated section behavior.

Useful? React with 👍 / 👎.

codecov · 2026-04-27T10:32:26Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 70.90%. Comparing base (21f6112) to head (e408253).
⚠️ Report is 10 commits behind head on dev.

@@            Coverage Diff             @@
##              dev     #458      +/-   ##
==========================================
+ Coverage   70.59%   70.90%   +0.30%     
==========================================
  Files        1208     1209       +1     
  Lines       86806    87045     +239     
  Branches    11369    11411      +42     
==========================================
+ Hits        61284    61720     +436     
+ Misses      21094    20877     -217     
- Partials     4428     4448      +20

Flag	Coverage Δ
ci	`70.90% <ø> (+0.30%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.
see 14 files with indirect coverage changes

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

eanzhao · 2026-04-27T13:19:56Z

Reviewing §A changes. A few issues:

1. Test gap: Section 7 (Trend) not asserted despite claiming full coverage

AgentBuilderToolTests.cs — the pinning test says "Structural sections must all be named in order" but only asserts sections 1–6 and 8, skipping section 7 entirely:

https://github.com/Aevatar-AI/aevatar/blob/feat/2026-04-27_daily-richer-content-prompt/test/Aevatar.GAgents.ChannelRuntime.Tests/AgentBuilderToolTests.cs#L75-L85

If the omission is intentional (Trend is optional), the comment should say so. Otherwise, add:

skillContent.Should().Contain("7. Trend");

2. No CI query guidance when repo allowlist is empty

The output schema defines §6 CI but the no-repo query suggestion list has zero GitHub Actions / check-run endpoints:

https://github.com/Aevatar-AI/aevatar/blob/feat/2026-04-27_daily-richer-content-prompt/agents/Aevatar.GAgents.ChannelRuntime/AgentBuilderTemplates.cs#L142-L149

The per-repo branch correctly includes /repos/{owner}/{repo}/actions/runs, but the global-search branch has nothing equivalent. The LLM has no guidance on how to populate the CI section in the no-repo case, so it will either skip it or hallucinate. Either add a search endpoint for CI (e.g. /search/commits?q=...+status:failure as a rough proxy) or explicitly note that CI is only available when repos are pinned.

3. `{default_branch}` substitution variable undocumented

In the per-repo query list, the CI query uses {default_branch} but it's never explained alongside {iso_date} and {username}:

https://github.com/Aevatar-AI/aevatar/blob/feat/2026-04-27_daily-richer-content-prompt/agents/Aevatar.GAgents.ChannelRuntime/AgentBuilderTemplates.cs#L154

The LLM may leave the literal {default_branch} in the URL or guess main/master. Consider either documenting the substitution (e.g. "replace {default_branch} with the repo's default branch — usually main or master") or switching to the unfiltered /actions/runs?per_page=10 and filtering client-side.

Otherwise the extraction into BuildDailyReportSkillPrompt and the omit-empty semantics look solid.

eanzhao

Found one additional issue while reviewing PR 458.

eanzhao · 2026-04-27T13:20:01Z

+        {
+            prompt
+                .AppendLine("Repository allowlist provided — run these per-repo (replace `{owner}/{repo}` with each entry, substitute `{iso_date}` with the start of the 24h window in ISO 8601 UTC). Do NOT collapse into one global query.")
+                .AppendLine("- GET /repos/{owner}/{repo}/pulls?state=closed&per_page=20               // filter merged_at >= {iso_date} client-side for shipped")


P2: This endpoint is not a reliable 24h merge-time source. /repos/{owner}/{repo}/pulls?state=closed&per_page=20 returns only one page of closed PRs, and the list API is not keyed by merged_at; filtering that small page client-side can miss PRs merged in the last 24h when they were created earlier or fall past the first page in active repos. Since this feeds the Shipped section, repo-scoped reports can undercount real shipped work. Prefer a query source keyed by merge time, e.g. repo:{owner}/{repo} author:{username} is:pr is:merged merged:>={iso_date}, or explicitly paginate/sort enough to make the client-side merged_at filter complete.

eanzhao · 2026-04-27T13:22:34Z

Additional review — supplementary to the three issues already raised:

4. `(always last)` for §8 Blockers conflicts with empty-day fallback

Line 134 in AgentBuilderTemplates.cs:

.AppendLine("8. Blockers (1 line, always last) — ...")

But two lines later (136) the empty-day rule says:

return ONLY the title line followed by `No measurable activity in the last 24h.` and nothing else.

On an empty day, the LLM is told both (a) "Blockers is always last" and (b) "return nothing else". A weaker model could synthesise both and emit Blockers anyway. Consider qualifying Blockers as: (always last, unless the entire report is empty-day).

5. Per-repo `?state=closed` shipped query pulls closed-but-not-merged PRs

Line 157:

.AppendLine("- GET /repos/{owner}/{repo}/pulls?state=closed&per_page=20               // filter merged_at >= {iso_date} client-side for shipped")

?state=closed returns PRs closed without merge (e.g. rejected). The instruction says "filter merged_at" but does not call out that a null merged_at means "not shipped". Without explicit guidance, the LLM may count closed-unmerged PRs in the Shipped section. Suggestion: add a note like "exclude PRs where merged_at is null" or switch to GET /repos/{owner}/{repo}/pulls?state=closed&sort=updated&direction=desc with "only count PRs whose merged_at falls in the window".

6. `{username}` substitution not documented alongside `{iso_date}` / `{owner}/{repo}`

In the no-repo query suggestions (lines 146–151), {username} appears in every URL but the substitution instruction (line 145) only documents {iso_date}. Given that issue #3 flags {default_branch} as undocumented, {username} has the same problem. Add a brief mention, e.g.:

(substitute `{username}` with {normalizedUser}, `{iso_date}` with the start of the 24h window in ISO 8601 UTC)

Otherwise the extraction into BuildDailyReportSkillPrompt and the omit-empty / section-budget structure are clean.

- §6 CI section: explicit OMIT in no-repo mode + the no-repo branch now states the CI gap inline; per-repo `/actions/runs` drops the `{default_branch}` placeholder and filters `conclusion=failure` + `created_at` client-side. - §8 Blockers: qualified as `always last unless the report is empty-day` and the empty-day fallback now explicitly says `do not append a Blockers line`. Resolves the contradiction a weaker model could synthesise both sides of. - Per-repo Shipped PRs: switch from `/repos/{owner}/{repo}/pulls?state=closed` to `/search/issues?q=repo:...+author:...+is:pr+is:merged+merged:>=...`. The list endpoint returned all contributors' closed-but-unmerged PRs (codex P1) and was unreliable on active repos with one-page pagination (eanzhao inline). Search form keys on merge time + author server-side. - §2 Shipped: add `/search/commits` queries in both modes so the "PRs merged AND commits authored" claim has a real source (codex P2). - Repo-mode Issues: add a `commenter:{username}` query so the "opened, closed, or commented on" schema isn't dropped by author-only filtering (codex P2). - Substitution variables: document `{username}`, `{iso_date}`, and `{owner}/{repo}` with concrete examples + a "do not leave literal placeholders" footer. The LLM was previously expected to infer the username substitution from "Primary GitHub username:" alone. - Tests: pin section 7 (Trend) slot in the order-pinning assertion (the schema position is load-bearing even when the section is optional). Pin the new commit query, the per-repo commenter query, the empty-day Blockers carve-out, the substitution-variable docs, and assert the dropped `/repos/{owner}/{repo}/pulls` + `{default_branch}` shapes are gone. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

eanzhao · 2026-04-27T13:39:22Z

All six issues from the previous review round are addressed in this update. Verified:

✅ Section 7 (Trend) test gap — now asserted, comment explains the rationale
✅ No-repo CI guidance — section 6 explicitly states "OMIT this section in no-repo mode", no-repo query list includes explicit skip comment
✅ {default_branch} removed — CI query now uses unfiltered /actions/runs?per_page=10 with client-side filtering; test asserts NotContain("{default_branch}")
✅ Blockers vs empty-day conflict — §8 now says "always last unless the report is empty-day"; empty-day rule says "do not append a Blockers line in this case"; test pins both phrases
✅ Closed-but-not-merged PRs — replaced /pulls?state=closed with search-API is:pr+is:merged+merged:>=; test asserts new query and NotContain on old one
✅ {username} substitution — new substitution-variable line documents {username}, {iso_date}, and {owner}/{repo}; test pins all three

Bonus fixes caught in this revision are also solid: shipped-commits query added for both modes, repo-mode issues now include the commenter case, final instruction blocks literal placeholder leakage.

LGTM — no remaining issues from my side.

eanzhao

The previous fix set looks mostly covered. I found one remaining issue around error handling semantics.

eanzhao · 2026-04-27T13:42:21Z

+
+        prompt
+            .AppendLine()
+            .AppendLine("If a query returns 4xx, 5xx, or empty, treat that source as zero for the affected section and continue — do not retry, do not fall back to invented data. Do not leave any literal `{username}` / `{iso_date}` / `{owner}/{repo}` placeholders in outbound URLs.");


P1: Do not collapse transport/auth failures into zero activity. nyxid_proxy currently returns error-shaped JSON for upstream failures ({"error": true, "status": 401/5xx, ...}), and #439 is specifically about the runner/LLM masking those as a successful "no activity" daily. This instruction preserves that failure mode by telling the model to treat 4xx/5xx as zero and continue, so a revoked GitHub grant or proxy outage can still produce No measurable activity with the agent left healthy. Please distinguish successful empty responses from tool/proxy failures: 2xx empty can be zero; 4xx/5xx/error-shaped tool results should be surfaced as a source failure (or at least not allowed to trigger the empty-day fallback).

…back eanzhao P1: the previous "4xx/5xx/empty → treat as zero" instruction silently masked transport/auth failures as healthy empty-day reports. A revoked GitHub OAuth grant or proxy outage would produce `No measurable activity in the last 24h.` with the agent left in a Running state — which is exactly the bug referenced in #439. Replace the catch-all with a tiered classification: - 2xx with an empty list → genuine zero data; section is omitted as before. - 4xx / 5xx / tool error envelope (`{"error": true, ...}`, revoked grant, proxy timeout) → source UNAVAILABLE, append `Source health: ...` footer below Blockers naming the failed sources. - Empty-day fallback (`No measurable activity in the last 24h.`) is now ONLY valid when EVERY source returned 2xx. If ANY source failed, emit at least the title plus the `Source health:` footer. Pin this distinction with three new test assertions plus a `NotContain` on the old wording, so future copy edits can't quietly regress the failure-handling behavior. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

eanzhao

The source-health fix addresses the P1 behavior, but it leaves one prompt-order conflict.

eanzhao · 2026-04-27T13:58:46Z

+            .AppendLine("5. Issues (≤4 lines) — issues opened, closed, or commented on by the user.")
+            .AppendLine("6. CI (≤3 lines) — failing GitHub Actions runs on the tracked repos. Best-effort and only feasible in repo-allowlist mode; OMIT this section in no-repo mode (the global search endpoints do not expose Actions run conclusions).")
+            .AppendLine("7. Trend (1 line, optional) — running totals vs the prior 24h, e.g. `Trend: shipped 3 (+1), reviews 5 (-2)`. Omit when the prior-window data could not be fetched.")
+            .AppendLine("8. Blockers (1 line, always last unless the report is empty-day) — `Blockers: <short list>` or `No blockers.` Auto-detect from: PRs >24h waiting on a review, CI red >2h, issues with labels `blocked` or `needs-info`.")


P2: This still says Blockers is always last except for empty-day reports, but the new source-health rule later requires Source health: at the very bottom after Blockers when any source fails. That gives the model two incompatible ordering rules in the exact case we care about for #439, and a weaker run can either drop the source-health footer or place it before Blockers. Please qualify this as something like always last among normal report sections unless empty-day or Source health is emitted, or make Source health an explicit final section/footer in the section order and pin that in the test.

eanzhao P2: §8 Blockers said "always last unless empty-day" but the source-health rule required `Source health:` "at the very bottom after Blockers" — two incompatible ordering rules in exactly the failure case #439 cares about. A weaker model could drop the Source health footer or place it before Blockers. Make Source health an explicit §9 footer in the schema list: - §8 Blockers: drop the "always last" qualifier; pin as `Position-locked at slot 8; the only section that may sit below it is the §9 Source health footer.` - §9 Source health: 1-line conditional footer, emitted only when at least one source returned non-2xx. When emitted, always the final line. - Empty-day fallback: explicitly forbid emitting §8 Blockers AND §9 Source health on a genuine empty day (every source 2xx with no items). If any source failed, the empty-day path is invalid — emit at least title + §9. - The "Source health — distinguish empty results from source failures" detail block is reframed as the rule for emitting §9, not a parallel ordering claim. Test pins: assert `9. Source health` in the order-pinning list, pin the new "Position-locked at slot 8" + "only section that may sit below it" wording on §8, and pin the strict empty-day forbidding both §8 and §9. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

chatgpt-codex-connector Bot reviewed Apr 27, 2026

View reviewed changes

eanzhao commented Apr 27, 2026

View reviewed changes

		.AppendLine("- GET /repos/{owner}/{repo}/pulls?state=closed&per_page=20 // filter merged_at >= {iso_date} client-side for shipped")
		.AppendLine("- GET /repos/{owner}/{repo}/pulls?state=open&per_page=20 // filter author = {username} client-side for in flight")

Conversation

eanzhao commented Apr 27, 2026

Summary

Out of scope (follow-ups for #423)

Test plan

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

codecov Bot commented Apr 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

eanzhao commented Apr 27, 2026

1. Test gap: Section 7 (Trend) not asserted despite claiming full coverage

2. No CI query guidance when repo allowlist is empty

3. {default_branch} substitution variable undocumented

Uh oh!

eanzhao left a comment

Choose a reason for hiding this comment

Uh oh!

eanzhao Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

eanzhao commented Apr 27, 2026

4. (always last) for §8 Blockers conflicts with empty-day fallback

5. Per-repo ?state=closed shipped query pulls closed-but-not-merged PRs

6. {username} substitution not documented alongside {iso_date} / {owner}/{repo}

Uh oh!

eanzhao commented Apr 27, 2026

Uh oh!

eanzhao left a comment

Choose a reason for hiding this comment

Uh oh!

eanzhao Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

eanzhao left a comment

Choose a reason for hiding this comment

Uh oh!

eanzhao Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

codecov Bot commented Apr 27, 2026 •

edited

Loading

3. `{default_branch}` substitution variable undocumented

4. `(always last)` for §8 Blockers conflicts with empty-day fallback

5. Per-repo `?state=closed` shipped query pulls closed-but-not-merged PRs

6. `{username}` substitution not documented alongside `{iso_date}` / `{owner}/{repo}`