
docs: annotate the tightened scoring ruler (scores NOT comparable across rulers) #1034

Merged
100yenadmin merged 1 commit into main from docs/scoring-ruler-rigor-annotation
Jun 19, 2026

Conversation

@100yenadmin

Member

Owner-flagged stable-checkpoint hygiene: current scores read lower than historic ones because the scoring ruler got more rigorous this cycle, not because quality regressed, and that was not annotated anywhere human-readable.

What this adds

Why it matters

We're beyond a normal release window; stable, differentiable checkpoints are critical. The sc_/lc_ stamping already exists in scores_db — this makes the human framing match, so past/future runs and versions stay distinguishable and the scorer can serve as the autonomous build-and-improve feedback loop it's designed to be.

Docs-only. 🤖 Generated with Claude Code

…mparable across rulers

The 2026-06 cycle materially tightened the scoring ruler (feature-engagement coverage scorer #1018,
acts felt-shape #1001/#1002, betrayal un-inversion #999, romance gate #997, dm_advanced_time unmask
#1024, gate-severity accuracy #1030), so a run scores LOWER under sc_d4b93982763a/lc_d7fcfddd5bf7
than under the v1.0.4 rulers — BY DESIGN (the scorer is a tightening feedback loop). Document the
ruler-version mechanism + history in SCORING.md §0 and annotate it in the v1.0.5-rc1 CHANGELOG, so
current numbers are never mis-compared to historic ones (every scores_db row is fenced by
scoring_config_version/lens_config_version). Stable-checkpoint hygiene per owner.
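
To illustrate the fencing described in the commit message, here is a minimal sketch of a comparison helper that refuses to diff scores stamped with different rulers. Only the column names `scoring_config_version` and `lens_config_version` come from this PR; the `score` field, the row-as-dict shape, and both function names are assumptions for illustration, not the actual scores_db API.

```python
from collections import defaultdict
from statistics import mean


def ruler_of(row):
    """Return the (scoring, lens) ruler pair a row was stamped with."""
    return (row["scoring_config_version"], row["lens_config_version"])


def mean_score_by_ruler(rows):
    """Average the hypothetical `score` field separately for each ruler pair."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[ruler_of(row)].append(row["score"])
    return {ruler: mean(scores) for ruler, scores in buckets.items()}


def score_delta(baseline_rows, candidate_rows):
    """Return mean(candidate) - mean(baseline), but only within a single ruler.

    Raises ValueError when the rows span more than one ruler pair, because a
    lower number under a tighter ruler does not imply a quality regression.
    """
    rulers = {ruler_of(r) for r in list(baseline_rows) + list(candidate_rows)}
    if len(rulers) != 1:
        raise ValueError(f"scores span {len(rulers)} rulers; not comparable: {sorted(rulers)}")
    return mean(r["score"] for r in candidate_rows) - mean(r["score"] for r in baseline_rows)
```

Under these assumptions, calling `score_delta` with one run stamped `sc_d4b93982763a`/`lc_d7fcfddd5bf7` and another stamped with the v1.0.4 rulers raises instead of returning a misleading negative delta, while `mean_score_by_ruler` still reports each ruler's average side by side.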
@coderabbitai

coderabbitai Bot commented Jun 19, 2026


Warning

Review limit reached

@100yenadmin, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 2 hours, 53 minutes, and 2 seconds. Learn how PR review limits work.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: d815b1d6-b128-4615-80f7-ad65411f84f1

📥 Commits

Reviewing files that changed from the base of the PR and between 540eff2 and 47ad054.

📒 Files selected for processing (2)
  • CHANGELOG.md
  • qa/SCORING.md

Comment @coderabbitai help to get the list of available commands and usage tips.

100yenadmin merged commit 522ec0c into main on Jun 19, 2026
20 checks passed
100yenadmin added a commit that referenced this pull request Jun 19, 2026
… annotation (#1034) (#1035)

Checkpoint marking the Guiding Bolt SRD duration fix (found by running the combat-sprint, proven
RED->GREEN, adversarially reviewed) + the ruler-version annotation. Still NOT a GA — mech remains
below the 4.5 bar; story BG-caliber + satisfaction green.

Co-authored-by: Eva <arncalso@gmail.com>