Skip to content

Address review#7

Merged
s0nik42 merged 29 commits into
mainfrom
address-review
Apr 1, 2026
Merged

Address review#7
s0nik42 merged 29 commits into
mainfrom
address-review

Conversation

@magik6k

@magik6k magik6k commented Mar 23, 2026

Copy link
Copy Markdown
Collaborator

Description

Describe your changes in detail. What does this PR do and why?

Related Issues

Fixes # (issue number)

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update
  • Refactoring (no functional changes)
  • Performance improvement
  • Test update

Checklist

  • I have read the CONTRIBUTING guidelines
  • My code follows the project's code style
  • I have added tests that prove my fix is effective or my feature works
  • New and existing unit tests pass locally with my changes
  • I have updated the documentation accordingly
  • I have checked that there are no breaking changes (or documented them above)
  • My changes generate no new warnings

Additional Notes

Add any additional notes for reviewers here.

magik6k added 29 commits March 23, 2026 15:38
…dd validation script

- S3 multipart: implement ListParts with pagination (part-number-marker, max-parts)
- S3 multipart: AbortMultipartUpload now actually deletes part entries
- S3 multipart: add ListParts routing in backend server and frontend proxy
- gwcfg: restructure Advanced into navigable submenu with logical categories
  (Caching, Parallelism, Replication, Database Tuning, etc.)
- gwcfg: skip writing empty variables that were passed over in menus
- Add check-and-validate.sh for E2E S3 endpoint testing and benchmarks
Clients using --no-sign-request (aws cli) or anonymous access don't
send the x-amz-content-sha256 header. Treat empty/missing as
UNSIGNED-PAYLOAD instead of rejecting with 500.
AWS CLI rejects multipart responses without Content-Type: application/xml
and with leading whitespace before the XML declaration.
- Use curl for all GET readbacks (avoid aws cli conditional-GET ETag issue)
- Use curl for PUT uploads in benchmarks (avoid chunked-encoding auth issue)
- Fix jq KeyCount parsing (aws cli omits field when 0, use Contents length)
- Remove local variables from backgrounded loops (subshell scoping)
Replace broken FIFO semaphore with wait -n job throttling.
Results on test deployment: 31 obj/s serial → 2876 obj/s at par=1024.
…fields

The JSX was referencing indexStats.ReadRate / indexStats.WriteRate which
don't exist on the backend TopIndexStats struct (it returns cumulative
Reads/Writes counters). The EMA-smoothed readRate/writeRate state
variables were computed but never rendered.
selectWeighted held selectionLk while calling createGroup, which can
block indefinitely in ensureSpaceForGroup (waiting for offload space).
All other writers were blocked on selectionLk even when existing groups
had capacity.

Fix: release selectionLk before createGroup so other writers can use
existing groups. Re-acquire and refresh candidates after.

Also add --max-time to curl PUTs in check-and-validate.sh.
The 1-minute sleep loop in ensureSpaceForGroup blocked the blockstore's
single start() goroutine, which blocked ALL writes via the puts channel.
Even with selectionLk released, r.lk was held during the sleep, and the
blockstore writer was stuck.

Replace the infinite loop with a single attempt: if no offload candidate
exists, return ErrNoSpace. The blockstore start() loop already handles
Put errors by sending them to waiting HTTP handlers (500) and continuing.
Two changes to prevent all writes funneling into one group:

1. ribBatch.Flush resets currentWriteTarget so the next write cycle goes
   through the load balancer instead of sticking to the same group via
   tryPreferredGroup.

2. calculateScore/pickBest now use weighted random selection proportional
   to available space. Groups with 2x free space get 2x the probability
   of being selected. This naturally creates staggered fill levels —
   as a group fills, its selection weight drops, sending new writes
   elsewhere — avoiding thundering-herd finalization where multiple
   groups complete simultaneously.
Replace weighted-random (which converges all groups to the same fill
level → thundering herd) with modular-clock selection.

With N candidates sorted by fill ratio, candidate i targets fill level
(i+0.5)/N. The candidate furthest below its target receives the write.
This deterministically maintains maximally distinct fill levels:
e.g. with 4 groups they converge to ~12.5%, 37.5%, 62.5%, 87.5% fill,
so only one group finishes at a time.
All groups receive writes (weighted-random with base weight 1.0).
A modular-clock bias of up to 30% (maxClockBias) nudges the group
furthest behind its fill target to receive slightly more writes.

With N groups sorted by fill, group i targets (i+0.5)/N. The group
with the largest positive gap gets the full bonus; others get
proportional fractions. Groups ahead of target get no bonus.

This keeps throughput high (writes go everywhere) while gently
steering groups toward maximally staggered fill levels.
Base weight 1.0 for every group. Groups behind their clock target get
a linear bonus of up to 30% × lag. No max-gap normalization, no
clamping artifacts — just a simple proportional nudge.
…r distance

Sort by group ID (not fill) for stable target assignment. Use modular
arithmetic for clockwise distance: behind = (target - fill) mod 1.
Groups 0-180° behind target get a linear bonus up to 30%.
Groups at or ahead of target get no bonus.

This correctly handles the wrap-around: a new group at fill 0 with a
high target (e.g. 0.75) is classified as slightly ahead (just past
target on the clock), not massively behind.
calculateScore returns fill ratio (0.0 for empty groups). The candidate
filter 'score > 0' excluded groups at fill 0 — meaning newly created
groups never received writes through the normal selection path.

Fix: return -1 for 'can't accept write', filter on score >= 0.
- ETag quoting: all 7 ETag emission points now wrap values in double
  quotes per HTTP/S3 spec. Fixes aws cli conditional GET (If-Match)
  returning 412 Precondition Failed.

- Multipart GC: add DeleteExpired to S3ObjectIndex, implemented with
  ALLOW FILTERING query on expires_at. Region.StartCleanup runs hourly
  via fx lifecycle to remove abandoned multipart parts after 24h TTL.

- Config comments: add doc comments to all 21 previously uncommented
  config variables (YugabyteCql, YugabyteSql, S3API, S3Cql, Prometheus).
UploadPart now returns quoted ETags per HTTP spec. When the client
sends them back in CompleteMultipartUpload XML body, the quotes must
be stripped before parsing as CID.
…filter

- carlog/idx_wal_bench_test.go: assign idx.Close() return to _ (3 errcheck)
- rbstor/access_tracker_integration_test.go: compare with float64 not int
  (GetObjectPopularity returns float64)
- rbstor/parallel_test.go: use score >= 0 to include empty groups
  (calculateScore now returns -1 for can't-accept, 0 for empty)
Start RBS workers in unlink tests, isolate SQL and CQL state between runs, and make same-batch Put operations win over Unlink as the batch contract expects.
@s0nik42 s0nik42 merged commit 721cb0c into main Apr 1, 2026
5 checks passed
@s0nik42 s0nik42 deleted the address-review branch April 1, 2026 13:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants