Skip to content

ADR-134: embed_threshold gated apply from fire-score telemetry (deferred, data-gated) #123

Description

@aaronsb

The one remaining slice of ADR-134 (Accepted). Its design is adopted and the rest of the empirical-tuning loop shipped; this is Decision 4's embed_threshold apply, deliberately deferred because it is data-gated.

Why it's deferred (not just unbuilt)

tune-precision (#10, PR #119) flags mis-targeted ways and recommends "raise embed_threshold" — but only as a direction, with no principled magnitude. Computing the right threshold requires the scores of the fires that occurred, which way_fired did not log.

The fire-score telemetry that makes a principled derivation possible shipped (#11a, PR #120): fire_score is now logged on first-fires. But it is near-empty today — the apply cannot be derived or validated over an empty population. This is the same sequencing as token_positiontune-curves: land the telemetry, let it accumulate, then build the consumer.

Gate condition (when to pick this up)

Enough organic fire_score events have accumulated to characterize a mis-targeted way's firing-score distribution. Check:

grep '"fire_score"' ~/.claude/stats/events.jsonl | wc -l   # need a meaningful population
ways tune-precision                                         # confirm mis-targeted ways still exist

(As of 2026-06-12, fire-score logging had just started — only this session's self-test fires exist.)

What to build

A gated embed_threshold apply, joining fire_score (placement scores) with tune-precision's off-class classification:

  1. For a mis-targeted way, find the smallest embed_threshold that would have excluded its off-class fires while keeping its on-class fires (a per-way recall/precision tradeoff over the observed score distribution).
  2. Report the suggestion with the sample size it derives from (ADR-134 Negative: don't codify one user's work distribution).
  3. --apply rewrites the embed_threshold: in way frontmatter as a reviewable, revertible git diff.

Hard rules (ADR-134 Decision 4 + Negative)

  • Never auto-apply vocabulary changes — they reshape the embedding neighborhood and stay authorial.
  • Only embed_threshold. The half_life apply already ships in ways tune-curves --apply — do not rebuild it.
  • Cross-cutting ways (high spread in tune-precision) must never be auto-narrowed; the apply should refuse or skip them.

References

Tracked as task #11b in this session's plan. Resume per the github way: branch → PR → code-reviewer → merge.

Metadata

Metadata

Assignees

No one assigned

    Labels

    adr:acceptedADR accepted; this issue tracks implementationarea:waysWays CLI, matching, steering layereffort:mediumA few sittingsenhancementNew feature or request

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions