Summary
The size_factor term in the weighted risk score is documented as a "deliberately tiny tie-breaker" that for "even a 10k-line file" contributes "well under one churn-point", but the actual magnitude is far larger than those comments imply.
Location
src/vcs/score.rs:36 (module doc: "deliberately tiny tie-breaker")
src/vcs/score.rs:121-123 (inline comment + size_factor computation)
Evidence
// Size is a tiny tie-breaker: squared log over 100 keeps even a
// 10k-line file contributing well under one churn-point.
let size_factor = ln1p(input.sloc as f64).powi(2) / 100.0;
size_factor enters base with coefficient 1.0 (+ size_factor), whereas the recency-churn "point" enters as 0.30 * ln1p(churn). Computed magnitudes:
| sloc | size_factor |
|---|---|
| 100 | 0.213 |
| 300 | 0.326 |
| 1 000 | 0.477 |
| 10 000 | 0.848 |
| 50 000 | 1.171 |
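The magnitudes above can be rechecked with a minimal sketch. It assumes the project's `ln1p` helper computes ln(1 + x), so the standard library's `f64::ln_1p` stands in for it here; the free function `size_factor` is a hypothetical wrapper around the quoted expression, not the code in score.rs.

```rust
/// Size term as quoted in the Evidence section.
/// Assumption: the project's `ln1p` is ln(1 + x), i.e. `f64::ln_1p`.
fn size_factor(sloc: u64) -> f64 {
    (sloc as f64).ln_1p().powi(2) / 100.0
}

fn main() {
    // Same SLOC values as the table in this report.
    for sloc in [100u64, 300, 1_000, 10_000, 50_000] {
        println!("{:>6}  {:.3}", sloc, size_factor(sloc));
    }
}
```

Since the term is ln1p(sloc)^2 / 100, it reaches 1.0 exactly when ln1p(sloc) = 10, which is at roughly 22k lines.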
For comparison, at the test baseline (churn_recent = 50) the dominant recency-churn contribution is 0.30 * ln1p(50) = 1.18, the long-churn term is 0.05 * ln1p(200) = 0.27, the entropy factor is 0.15, and the baseline fix term is 0.069.
So:
- At sloc=300 the size term (0.326) already exceeds both the entire entropy factor (0.15) and the long-churn term (0.27), and is ~5x the fix term (0.069).
- At sloc=10k the size term (0.848) is ~72% of the dominant recency-churn term (1.18), which is not "well under one churn-point."
- At sloc=50k the size term is 1.17, i.e. it exceeds one whole unit, directly contradicting the comment. The crossover is at ln1p(sloc) = 10, i.e. around 22k lines.
Because the size term is inside base, it is further amplified by the multiplicative (1.0 + dev_bonus + new_file_bonus) factor (up to 1.50x), so a new 10k-line file with >=9 developers sees a size contribution of 0.848 * 1.50 = ~1.27.
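The comparison against the recency-churn term and the amplified figure can be sketched the same way. This is an illustration under stated assumptions, not the project's scoring code: `f64::ln_1p` stands in for `ln1p`, the helper names are hypothetical, and the 1.50 multiplier is simply the maximum quoted in this report.

```rust
/// Size term as quoted in the Evidence section (assumes `ln1p` is ln(1 + x)).
fn size_factor(sloc: u64) -> f64 {
    (sloc as f64).ln_1p().powi(2) / 100.0
}

/// Recency-churn contribution, 0.30 * ln1p(churn), as stated in this report.
fn recent_churn_term(churn_recent: u64) -> f64 {
    0.30 * (churn_recent as f64).ln_1p()
}

/// Maximum value of (1.0 + dev_bonus + new_file_bonus) quoted in this report.
const MAX_MULTIPLIER: f64 = 1.50;

fn main() {
    let size = size_factor(10_000);
    let churn = recent_churn_term(50); // test-baseline churn_recent

    println!("size term at 10k SLOC:         {:.3}", size);
    println!("recency-churn term at 50:      {:.3}", churn);
    println!("size / recency-churn:          {:.2}", size / churn);
    println!("size term after max multiplier: {:.2}", size * MAX_MULTIPLIER);
}
```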
Expected Behavior
The doc comments should describe the term's true weight, or the formula should bound/down-weight size_factor so it matches the stated "tiny tie-breaker" intent.
Actual Behavior
The comments understate the size term by roughly an order of magnitude relative to its real contribution, which could mislead a maintainer tuning the formula. A future reader who trusts the "tie-breaker" framing may not realize that a file's size alone materially shifts its rank.
Impact
Affects maintainers who reason about the risk-score weighting from the in-source documentation. The score remains ordinal, so this is a documentation-accuracy / formula-design concern rather than an output-corruption bug.
Resolution
Corrected score.rs comments: size_factor enters base at coefficient 1.0 with real magnitudes (~0.85 at 10k SLOC, >1.0 past ~50k), not a tiny tie-breaker. Formula unchanged. Commit a7e35ad.