chore: better rate limit implementation#36
Conversation
There was a problem hiding this comment.
Pull request overview
This PR refines the server’s DB-backed rate limiting so that login rate limits are driven by failed attempts only (with an explicit reset on successful authentication), improving UX while preserving lockout behavior.
Changes:
- Added
reset_rate_limit()and extendedcheck_rate_limit()withis_failed_attemptto support read-only lockout checks and explicit failure increments. - Updated
/v1/auth/loginflow to check lockout without incrementing, increment only on password/TOTP failures, and reset on success. - Added a unit test suite covering 8 failed-attempt-only and reset/lockout scenarios.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
server/core/security.py |
Adds reset support and conditional (failed-attempt-only) rate limit increment behavior. |
server/api/auth.py |
Rewires login handler to separate lockout check, failure increments, and success reset. |
tests/unit/test_rate_limit_failed_attempts.py |
New tests validating failed-attempt-only counting, reset semantics, and lockout preservation. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # Successful authentication clears accumulated failure count for this IP. | ||
| await reset_rate_limit(rate_limit_key) | ||
|
|
||
| await db.commit() | ||
|
|
There was a problem hiding this comment.
reset_rate_limit() is invoked before db.commit(), and it uses a separate DB session/transaction. If the session insert/commit later fails, the rate-limit counter will still have been cleared. Consider moving the reset call to after a successful db.commit() (or executing the reset using the same db session/transaction) so the limiter state stays consistent with the login outcome and to reduce the chance of SQLite write-lock contention from nested sessions.
| # Successful authentication clears accumulated failure count for this IP. | |
| await reset_rate_limit(rate_limit_key) | |
| await db.commit() | |
| await db.commit() | |
| # Successful authentication clears accumulated failure count for this IP. | |
| await reset_rate_limit(rate_limit_key) |
| # Clean up non-locked (or expired lock) rows. | ||
| await db.execute( | ||
| delete(RateLimit).where( | ||
| and_( | ||
| RateLimit.key == key, | ||
| (RateLimit.locked_until.is_(None) | (RateLimit.locked_until <= now)), | ||
| ) | ||
| ) | ||
| ) | ||
|
|
||
| # Keep active lockout but clear attempt counter. | ||
| await db.execute( | ||
| update(RateLimit) | ||
| .where(and_(RateLimit.key == key, RateLimit.locked_until > now)) | ||
| .values(attempts=0) | ||
| ) |
There was a problem hiding this comment.
reset_rate_limit() performs two separate write statements (DELETE then UPDATE). That means the reset isn’t a single atomic statement as described in the PR plan, and on databases with MVCC (e.g., Postgres/MySQL) another transaction could interleave between the two operations. If you want this to be race-resistant, consider implementing it as one SQL statement (e.g., a CTE that conditionally UPDATEs active-lock rows and otherwise DELETEs) or otherwise ensuring atomicity under concurrent login attempts.
mythic3011
left a comment
There was a problem hiding this comment.
This PR is not merge-ready.
It changes /v1/auth/login from a coarse per-IP request throttle into failed-attempt-only lockout behavior. In the new flow, successful logins do not consume the limiter, and a successful login resets the same shared IP-scoped key.
That creates two regressions:
- a valid account on the same IP can clear brute-force history for other usernames behind that IP
- valid credentials can be used to make unbounded login requests from one IP, removing the previous coarse endpoint throttle
Please restore a coarse per-IP request cap for /login and do not let a successful login clear the shared IP-wide failure bucket.
Not ready.
…successful login clear the shared IP-wide failure bucket.
mythic3011
left a comment
There was a problem hiding this comment.
Approved as a regression fix.
This restores the coarse shared per-IP /login cap and stops successful login from clearing that shared bucket, which is the right immediate fix.
Follow-ups still needed:
- define whether we also want per-account or username+IP failed-attempt throttling
- add mixed success/failure regression coverage
- clean up helper naming so the policy is explicit at the call site
Mergeable now, but not the final login throttling design.
|
I’d like to take ownership of this ticket so we can get it to a merge-ready state before the 8 Apr 2026 deadline. I also think this should be marked as a co-work ticket in the report, not a single-owner item, because the final merged result will reflect both the original implementation attempt and the follow-up repair / alignment work. Current review findings:
Verdict: not ready. Before I proceed, I want to confirm:
My scope will be narrow:
|
mythic3011
left a comment
There was a problem hiding this comment.
We should keep the main helper direction here, not the current PR’s partial is_failed_attempt / reset_rate_limit model.
Right now the branch mixes two incompatible contracts:
/logincalls the new helper shapesecurity.pystill exposes the old one- tests reference
reset_rate_limit, but it is not actually implemented - the resulting branch is both broken and policy-inconsistent
main is the safer base because it already preserves one coherent coarse shared per-IP abuse boundary and does not let successful login clear that shared state.
If we want a more refined login throttling model later, that should be a separate explicit follow-up, not a half-merged contract change in this PR.
|
Thanks for continuing to look into this. I agree there is still room to improve the rate-limit policy, and you’re welcome to keep working on it as a nice-to-have follow-up if you’d like to push it further. At the same time, given the current timeline, I think we should be careful about priority. Compared with this refinement work, the report, testing evidence, and final video/demo preparation are more important for submission right now and will have a more direct impact on delivery and grading. So my suggestion would be:
|
Summary
Added reset support and conditional increment behavior in
security.py.Updated login flow to:
Check lockout without incrementing before auth.
Increment only on failed password or failed TOTP.
Reset login failure tracking on successful login.
Changes are in
auth.py.Added a new unit test suite with 8 scenarios in
test_rate_limit_failed_attempts.pyWhy
Successful and unsuccessful login/register contribute to rate limit based on same IP
Improve UX
Change / Contract Impact
If any box other than "no behavior change" applies, explain:
Scope
If this PR is tied to a bug report, confirm:
Verification
git diff --checkList any additional commands actually run:
Merge Readiness
A PR is not merge-ready if:
Workflow / Protection Impact
If any box other than "no impact" applies, explain:
Stacked PR Note
If stacked, link the base PR and state merge order: