PRD: Goodbye gRPC — plain net/http rewrite with DB indexing, de-caching, TDD contract suite, and observability #112

@SyniRon

Supersedes #95 (investigation complete — this is the resulting plan). Renders #94 moot once Phase 3 lands. Verification powered by the analytics work from #92.

Problem Statement

The API maintains two transports — a gRPC server and an HTTP/JSON gateway that translates into it — for an audience that uses exactly one: seven days of request analytics show zero non-loopback gRPC traffic; every known consumer is HTTP-only. The proto/buf/codegen toolchain, the second listener, and the generated gateway exist solely to serve the API's own internal loopback.

Meanwhile the actual production pain is elsewhere:

  • Every cache-miss request costs 4.4–5.4 seconds, because every milpac lookup runs an aggregation over the forum's largest table (585k rows) to compute last-post dates — unindexed. The Redis response cache is not an optimization; it is life support masking this, at the cost of an extra service, an invalidation poller, and stale-read semantics.
  • The full Past Members roster response is 30.4MB of JSON squeezed through a 20MB internal gRPC ceiling. It survives today only because the protobuf encoding is more compact than the JSON it becomes: it already consumes roughly 75% of that ceiling, and the roster only grows.
  • There is no error monitoring (no panic capture, no aggregation) and no metrics beyond grep-able request logs.
  • The Keycloak lookup surface is dead weight: the auth path it served no longer exists (per the domain glossary, it is "on the chopping block").
  • Contributors must install buf/protoc toolchains and run codegen to touch any endpoint; CI builds download and run the same machinery on every push.

Solution

Rewrite the API as a single-listener, plain net/http JSON service (stdlib mux, no framework) that preserves every public URL and response shape, then delete the gRPC server, the gateway, the proto/buf toolchain, and the second port. In the same program of work, in strictly sequenced phases:

  1. Observe first — Sentry error capture wraps the existing stack before anything changes; request logs gain a duration field.
  2. Index the database (admin-applied, measured: the hot aggregation drops from ~4.2s to ~30ms) so that…
  3. …the Redis response cache becomes deletable — removing Redis, the cache package, and the table-update poller entirely.
  4. Rewrite test-first: a golden contract corpus recorded from the current stack is the red suite; handlers are implemented to green; the public contract is enforced by semantic JSON comparison forever after.
  5. Delete the toolchain and land Prometheus metrics, Cache-Control headers, and refreshed docs/ADRs.

The Keycloak lookup route, RPC, and response fields are removed as part of the rewrite. Consumers see identical URLs, bodies, and error shapes — except for an explicit, documented list of deliberate breaks.

User Stories

  1. As an API consumer, I want every public URL I use today to keep working unchanged after the rewrite, so that I never have to modify my integration.
  2. As an API consumer, I want JSON response bodies that are semantically identical to today's (same field names, same types, same null/empty conventions), so that my parsers keep working without edits.
  3. As an API consumer, I want error responses to keep their current shape and status codes, so that my error handling and retry logic keep working.
  4. As an API consumer, I want a cache-miss profile lookup to return in tens of milliseconds instead of ~4.4 seconds, so that my bot commands feel instant.
  5. As an API consumer, I want the full Past Members roster to keep working as it grows, so that my archival tooling doesn't one day hit an invisible internal size ceiling.
  6. As an API consumer, I want Cache-Control headers on read endpoints, so that I know how fresh the data is and can poll respectfully instead of guessing.
  7. As an API consumer fetching rosters, I want both the enum-name and numeric roster path forms to keep working, so that whichever form I integrated against remains valid.
  8. As a tickets consumer, I want repeated query-parameter filters and both snake_case/camelCase parameter spellings to keep binding the way they do today, so that my list queries keep filtering correctly.
  9. As an S1 staff member, I want the uniforms roster view to keep serving the same shape, so that the uniforms tool keeps rendering without changes.
  10. As a Discord-integration operator, I want member lookup by Discord ID to behave identically, so that account linking keeps resolving members.
  11. As the maintainer, I want the gRPC server, gateway codegen, proto files, and buf toolchain deleted, so that there is no infrastructure serving an audience of zero.
  12. As the maintainer, I want a single public listener and a single port, so that deployment, reverse-proxy config, and the mental model all shrink.
  13. As the maintainer, I want the dependency graph to shrink by roughly a hundred modules, so that dependabot noise, build times, and audit surface all drop.
  14. As the maintainer, I want Redis and the cache subsystem deleted once indexes make them redundant, so that there is one fewer service to run, monitor, and explain.
  15. As the maintainer, I want each phase shipped as an independently deployable, independently revertible change, so that any regression is attributable to exactly one variable.
  16. As the maintainer, I want panics and 5xx errors captured in Sentry with release tagging, so that I find out about production breakage from an alert instead of a user report.
  17. As the maintainer, I want Sentry events tagged with key ID and route — but never the bearer token — so that I can attribute errors without leaking credentials.
  18. As the maintainer, I want Prometheus metrics (request counts by route/status/key, latency histograms by route, Go runtime stats) on an internal-only listener, so that I can build dashboards without exposing operational data publicly.
  19. As the maintainer, I want per-key request counters, so that future rate-limiting decisions are based on evidence about who calls what, how often.
  20. As the maintainer, I want the throwaway request-log analytics retired once Prometheus lands, so that request-volume and latency questions are answered by purpose-built instrumentation instead of log scraping.
  21. As the database admin, I want the index DDL as an idempotent script I apply manually, so that the API never runs DDL against the forum's schema and I stay in control of when it happens.
  22. As the database admin, I want that script kept in-repo with a documented re-apply procedure, so that if a forum add-on upgrade rebuilds tables and drops the indexes, restoring them is one command.
  23. As a future contributor, I want endpoints defined as plain Go handlers with plain structs, so that adding or changing an endpoint requires no codegen toolchain, no proto knowledge, and no special setup.
  24. As a future contributor, I want a golden contract test suite, so that I can refactor confidently knowing any observable contract change fails CI.
  25. As a future contributor, I want datastore tests that run against a real MariaDB with realistic fixtures, so that query changes are verified against actual SQL behavior instead of mock expectations that assert my own code back at me.
  26. As the maintainer, I want EXPLAIN-plan assertions in the test suite, so that a future query or schema change that silently reintroduces a full-table scan fails a test instead of a production latency budget.
  27. As the maintainer, I want the dead Keycloak lookup surface (route, response fields, datastore method) removed, so that dead code stops imposing maintenance and contract burden.
  28. As the maintainer, I want every deliberate contract break enumerated in one documented list (Keycloak fields gone, 405 replacing the 501 quirk, dropped gateway artifacts), so that the diff between "preserved" and "intentionally changed" is auditable.
  29. As the maintainer, I want a hand-owned OpenAPI 3.1 document that CI validates against the golden corpus, so that reference docs can never drift from observed behavior — a stronger sync guarantee than generation provided.
  30. As the maintainer, I want ADRs recording why the split-process design and the response cache were retired, so that the architectural history stays honest and navigable.
  31. As the maintainer, I want CI to build and test without downloading buf or running codegen, so that pipelines get faster and simpler.
  32. As a key holder among the top consumers, I want a heads-up before cutover with the deliberate-breaks list, so that I can sanity-check my integration against it.
  33. As a newer developer adding an endpoint, I want CI to fail with a precise message when the spec and the implementation disagree (route missing from spec, wrong field type, documented-but-nonexistent operation), so that keeping docs accurate requires no tribal knowledge or discipline.

Implementation Decisions

Phasing — one behavior change per deploy, each revertible:

| Phase | Ships | Verification |
|---|---|---|
| 0. Observe | Sentry (errors-only) wrapping the existing stack; duration= added to request log lines as temporary instrumentation (measuring stick for Phases 1–2, no format-continuity obligation) | events arriving; latency baseline captured from logs |
| 1. Index | MariaDB integration harness + EXPLAIN red→green tests; index script applied manually by admin | measured latency delta in logs |
| 2. De-cache | cache middleware removed from the chain; after soak, the cache package, Redis, the cache-manager goroutine, and table-update polling are deleted | latency holds without cache; roster pollers watched specifically |
| 3. Rewrite | golden corpus (red) → types package → handlers per route (green) → cutover to single listener; Prometheus; Cache-Control; full Sentry wiring | golden suite green against the new stack |
| 4. Delete | proto/buf/gRPC/gateway/second port/CI steps/build tooling; docs + ADRs | build green; golden suite green; dependency tree shrinks |

Stack: stdlib net/http with the Go 1.22+ pattern-routing mux. No web framework. Evaluated and rejected: huma (its defaults — $schema injection, RFC 9457 errors — fight a frozen legacy contract; its docs/validation value is exactly what the contract neutralizes), chi (adds nothing at this route count), oapi-codegen/ogen (swaps one codegen toolchain for another), connect-go + vanguard (wire-perfect but keeps protos/buf and adds a pre-1.0 dependency — contradicts the deletion goal).

Types package as the new domain model. Hand-written structs replace the generated proto types everywhere, including the datastore interface and its implementations (generated Go field names match what a human would write, so the migration there is import surgery plus enum constants and slice allocation). The structs reproduce the current wire conventions, all verified empirically against the live stack:

  • lowerCamelCase JSON names; 64-bit integers serialized as JSON strings (,string tags — proven byte-identical); 32-bit integers as numbers.
  • Enums as name strings via custom marshalers whose zero value emits the _UNSPECIFIED name (zero-safe by construction).
  • Emit-everything semantics (the gateway runs EmitUnpopulated): no omitempty anywhere; unset nested messages as null (pointer fields); empty collections as []/{} (allocation discipline enforced by goldens).
  • Integer-keyed maps serialize with string keys (stdlib already does this).
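
The conventions above can be sketched with plain structs and encoding/json. This is a minimal illustration, not the actual types package: RosterType, its enum names, and the Profile and Rank fields are hypothetical stand-ins chosen for the example, and the numeric fallback for unknown enum values is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// RosterType is a hypothetical enum; the real names come from the current proto files.
type RosterType int32

const (
	RosterTypeUnspecified RosterType = iota // zero value
	RosterTypeCombat
)

var rosterTypeNames = map[RosterType]string{
	RosterTypeUnspecified: "ROSTER_TYPE_UNSPECIFIED",
	RosterTypeCombat:      "ROSTER_TYPE_COMBAT",
}

// MarshalJSON emits the enum name, so the zero value serializes as the
// _UNSPECIFIED name. Emitting the number for unknown values is an assumption.
func (r RosterType) MarshalJSON() ([]byte, error) {
	if name, ok := rosterTypeNames[r]; ok {
		return json.Marshal(name)
	}
	return json.Marshal(int32(r))
}

// Rank is a hypothetical nested message.
type Rank struct {
	RankShort string `json:"rankShort"`
}

// Profile is a hypothetical shape; no field uses omitempty.
type Profile struct {
	UserID     uint64           `json:"userId,string"` // 64-bit integer: JSON string
	ForumPosts int32            `json:"forumPosts"`    // 32-bit integer: JSON number
	Roster     RosterType       `json:"roster"`        // enum: name string
	Rank       *Rank            `json:"rank"`          // unset nested message: null
	Awards     []string         `json:"awards"`        // allocated empty slice: []
	Records    map[int32]string `json:"records"`       // integer keys become string keys
}

// NewProfile allocates the collections so they marshal as [] and {} instead of null.
func NewProfile() Profile {
	return Profile{Awards: []string{}, Records: map[int32]string{}}
}

func encode(p Profile) string {
	b, err := json.Marshal(p)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	p := NewProfile()
	p.UserID = 1234567890123
	p.Records[7] = "promoted"
	fmt.Println(encode(p))
}
```

A nil slice or map would marshal as null, which is why the constructor allocates both; that is the kind of slip the goldens are meant to catch.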

Compatibility bar is semantic JSON equality, not byte equality. The protobuf runtime deliberately randomizes protojson whitespace per build — today's API already changes byte output on every deploy, so no client can be byte-sensitive. Golden comparison parses and canonicalizes.
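
One way to implement "parses and canonicalizes" is to decode both bodies into generic values and compare those, ignoring whitespace and object-key order. The sketch below is an assumption about the approach, not the harness itself; semanticEqual and canonicalize are hypothetical helper names.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"reflect"
)

// decode parses JSON into generic values. UseNumber keeps numeric literals
// as written instead of converting them to float64.
func decode(raw []byte) (any, error) {
	dec := json.NewDecoder(bytes.NewReader(raw))
	dec.UseNumber()
	var v any
	if err := dec.Decode(&v); err != nil {
		return nil, err
	}
	return v, nil
}

// semanticEqual reports whether two JSON documents carry the same values.
// Whitespace and object-key order are ignored; array order is significant.
func semanticEqual(a, b []byte) (bool, error) {
	va, err := decode(a)
	if err != nil {
		return false, err
	}
	vb, err := decode(b)
	if err != nil {
		return false, err
	}
	return reflect.DeepEqual(va, vb), nil
}

// canonicalize re-marshals the parsed value. encoding/json writes map keys
// in sorted order with no extra whitespace, which gives a stable golden form.
func canonicalize(raw []byte) (string, error) {
	v, err := decode(raw)
	if err != nil {
		return "", err
	}
	out, err := json.Marshal(v)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	old := []byte(`{"b": [1, 2],  "a": {"x": null}}`)
	rewritten := []byte(`{"a":{"x":null},"b":[1,2]}`)
	same, err := semanticEqual(old, rewritten)
	fmt.Println(same, err)
}
```

Because numbers are compared as literals here, `1` and `1.0` would count as different; whether the harness should treat them as equal is a decision this sketch does not make.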

Error contract preserved: the gRPC-status JSON shape ({"code": <grpc-code-number>, "message": ..., "details": []}) with the existing code→HTTP-status mapping; handler-specific message strings preserved verbatim (including the ones that leak parse-error text — they're frozen behavior). Auth failures stay exactly as they are live today: plain-text two-tier 401s from the middleware (scheme error message vs. generic Unauthorized), no WWW-Authenticate header. Unknown paths under the API prefix return the JSON 404 body, not the stdlib text 404.

Request-side leniency preserved: enum path parameters accept name or number; repeated query parameters bind by key repetition; query keys accepted in both snake_case and camelCase; unknown parameters ignored; unbound message fields remain query-bindable on profile routes. The by-ID profile lookup keeps its current semantic (the path value matches the milpac relation key, which is not the forum user id) — frozen as-is, whatever its name implies.

Deliberate breaks (the complete list):

  • Keycloak: lookup route removed (becomes the standard JSON 404); keycloakId fields removed from both profile shapes; the datastore lookup method deleted. Golden comparison applies one documented transform (strip the field) rather than mutating recorded truth.
  • Wrong-method now 405 + Allow (was 501 old-stack, 404 in un-pinned new code). Affects only unsupported write attempts against a read-only API, so there is no legitimate consumer impact. Ruled at #125 ("Phase 3 tracer: GET /api/v1/milpacs/ranks end-to-end on the new net/http stack"): HEAD stays a supported read verb (200 on valid routes, no body; the old stack answered 501 to HEAD too); 404 is reserved for genuinely unknown routes; the 405 body keeps the old stack's wrong-method JSON verbatim ({"code":12,"message":"Method Not Allowed","details":[]}), so only the HTTP status and the Allow: GET, HEAD header change. Rationale: the API is read-only by published contract. No write endpoints exist and the docs say so, so any POST/PATCH/etc. was never a supported call and there is no legitimate consumer whose error handling the change can break; the behavior-neutral cutover guarantee covers the published GET/HEAD surface, which is untouched. That removes the only reason to shim the old stack's 501 and frees the correct code: 405 preserves the "route exists, method doesn't" signal that 404 would discard, and Allow answers a mistaken-but-legitimate client in one round trip; public docs mean 405 leaks nothing. When write endpoints arrive, those routes advertise their own Allow and inherit this default: a forward-compatible pattern, not a one-off. Pinned new-stack-only (rest/rest_test.go): the golden battery replays against the old stack too, and a shared 405 case would go red there.
  • Gateway artifacts dropped: Grpc-Metadata-* response headers (today visible only on tickets routes; the cache already strips them elsewhere), X-HTTP-Method-Override, the form-POST-to-GET fallback, legacy percent-encoding path unescaping.
  • Invalid enum values in queries return 400 instead of being silently dropped.
  • X-Cache header disappears with the cache.
  • Scope check now precedes request binding: a valid key lacking the route's scope gets the 403 even when the request is also malformed (old stack: every binding 400, whether path type-mismatch, query parse, or ParseForm syntax, fired in the gateway before RequireScope, which lived inside the RPC bodies, so the wrong-scoped caller saw the 400). Ruled at #151 ("Phase 3: profile lookup routes: id, username, discord, gamertag (#126)") and #152 ("Phase 3: tickets routes: list, by id, by ref, messages, categories (#129)"), ratified 2026-06-06: the new stack layers uniformly 401 → 403 → route semantics. Rationale: a wrong-scoped key cannot use the route either way; answering the scope error first leaks nothing about request-shape validation to unauthorized callers and keeps the auth layering a single ordered contract. One deliberate exception: GET /api/v1/tickets/ref/messages is an exact-path parity shim reproducing a gateway-level 400 that predated scope in the old stack, so it stays scope-independent (auth 401 tiers still precede it). Pinned new-stack-only in both PRs' tests.
  • Deterministic query binding where the old gateway was map-order nondeterministic (never golden-pinnable). Ruled at #151 ("Phase 3: profile lookup routes: id, username, discord, gamertag (#126)") and #152 ("Phase 3: tickets routes: list, by id, by ref, messages, categories (#129)"), ratified 2026-06-06, with one protocol across both binders: values group snake → camel → bracket-folds (sorted by raw key); every provided value still parses (any malformed value 400s with the gateway's frozen wire text, as the old stack deterministically did); when all parse, the last value (i.e. camelCase when both spellings are present) wins for scalars. Deterministic same-key repetition on scalars keeps the old 400 (too many values); bracket keys keep the gateway's valuesKeyRegexp fold. Rationale: the only territory changed is what the old stack itself answered nondeterministically; every deterministic old behavior in the family is preserved verbatim and pinned.
  • Path-cleaning 307 on unclean paths (ruled 2026-06-06, wave-2 review, #128 "Phase 3: position groups, position search, and AWOL routes"): the new stack's ServeMux cleans request paths before matching, so GET .../position/search/A//B (or A/../B) answers 307 Temporary Redirect to the cleaned path, where the old gateway's ** glob matched the raw segments and served the 200 search. Accepted with the warts fixed: the redirect carries the contract JSON body (not net/http's HTML; codeUnknown/"Temporary Redirect", mirroring the 405 fallback precedent) and meters under the bounded / catch-all label with the key id; Location is byte-identical to the mux's own redirect. Rationale: realistic clients never emit unclean paths. The only consumers affected are malformed callers, and one extra round trip with an honest JSON body beats an invasive pre-mux parity intercept maintained forever. Pinned both ways (TestNewStack_SearchUncleanPathIs307WithJSONBody, TestMetrics_CleanPath307MetersUnderCatchAllWithKeyId); implementation rest/redirect.go.
  • Encoded path separators (%2F) are now segment data, not routing separators (ruled 2026-06-06, pre-7Cav/api#131 triage). The old gateway percent-decoded paths before routing, so %2F was routing-equivalent to a literal /; the new stack's ServeMux matches the escaped path (RFC-correct), so %2F stays data within a single segment and binds into the path value. Non-wildcard sibling of the #128 ("Phase 3: position groups, position search, and AWOL routes") search-segment break. Affects only requests that percent-encode the route's own separators; a correctly-templated client encodes only the parameter value, which contains no separator. Rationale: the %2F-as-separator equivalence was an accident of the gateway's decode-before-route, in the same dropped-artifact family as legacy percent-encoding unescaping; the one row that could have had a real consumer (tickets, below) is closed by ownership rather than log evidence. Divergent shapes, probed against both stacks 2026-06-06 (pinned new-stack-only in TestNewStack_EncodedSlashStaysSegmentData, rest/rest_test.go; the corpus replays the old stack, which produces the pre-break behavior):
    • GET /api/v1/roster/combat%2Ffoo: 404 {"code":5,"message":"Not Found","details":[]} → 400 {"code":3,"message":"type mismatch, parameter: roster, error: combat/foo is not valid","details":[]} (type-mismatch on the bound roster value).
    • GET /api/v1/milpacs/profile/username/john%2Fdoe: generic 404 Not Found → 404 {"code":5,"message":"no profile found for username: john/doe","details":[]} (bound-value message).
    • GET /api/v1/tickets/1%2Fmessages: 200 (the old stack served the messages list via decode-before-route) → 400 {"code":3,"message":"type mismatch, parameter: ticket_id, error: strconv.ParseUint: parsing \"1/messages\": invalid syntax","details":[]}. Migration: use the literal form /api/v1/tickets/{id}/messages, unaffected on both stacks. Sole scoped consumer confirmed and ruled breakable at cutover: read:tickets has a single key holder (the maintainer), so the consumer set for the encoded form is empty by ownership, not by log evidence.
    • GET /api/v1/tickets/ref%2Fmessages: 400 on both stacks; the interpolated parse text changes (parsing "ref" → parsing "ref/messages"). Same family: the whole segment now binds.
    • GET /api/v1/foo%2Fbar (unknown path): identical JSON 404 on both stacks — no divergence outside bound segments; pinned as the family's no-divergence case.
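
The wrong-method break in the list above can be illustrated with a small wrapper around a read-only route. This is a sketch under assumptions, not the contents of rest/rest_test.go or the real handler: readOnly and statusFor are hypothetical names, and the stub response body is a placeholder.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// The 405 body is the old stack's wrong-method JSON, as quoted in the list above.
const methodNotAllowedBody = `{"code":12,"message":"Method Not Allowed","details":[]}`

// readOnly lets GET and HEAD through and answers every other method with
// 405, an Allow header, and the frozen JSON body.
func readOnly(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodGet && r.Method != http.MethodHead {
			w.Header().Set("Allow", "GET, HEAD")
			w.Header().Set("Content-Type", "application/json")
			w.WriteHeader(http.StatusMethodNotAllowed)
			fmt.Fprint(w, methodNotAllowedBody)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// statusFor sends one request through the wrapper and returns the status
// code and Allow header a client would observe.
func statusFor(method string) (int, string) {
	ranks := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `{}`) // placeholder body, not the real response shape
	})
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(method, "/api/v1/milpacs/ranks", nil)
	readOnly(ranks).ServeHTTP(rec, req)
	return rec.Code, rec.Header().Get("Allow")
}

func main() {
	for _, m := range []string{"GET", "HEAD", "POST"} {
		code, allow := statusFor(m)
		fmt.Printf("%s -> %d Allow=%q\n", m, code, allow)
	}
}
```

The wrapper only covers routes that exist; unknown paths still need the separate JSON 404 described under the error contract.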

Database indexes (admin-applied, idempotent, in-repo as the source of truth; derived from measurement on the mirror — the composite turns the per-request post-date aggregation into a loose index scan, 4,226ms → 28ms):

ALTER TABLE xf_post ADD INDEX IF NOT EXISTS user_id_post_date (user_id, post_date);
ALTER TABLE xf_nf_rosters_service_record ADD INDEX IF NOT EXISTS idx_relation_id (relation_id);
ALTER TABLE xf_nf_rosters_user_award ADD INDEX IF NOT EXISTS idx_relation_id (relation_id);
ALTER TABLE xf_nf_rosters_user ADD INDEX IF NOT EXISTS idx_user_id (user_id);

The existing aggregation query stays as written (the indexed derived table measured faster than a correlated rewrite). The API never executes DDL; if add-on upgrades ever clobber the indexes, the long-term home for re-application is the ApiKeyManager add-on's schema step.

Cache removal: middleware first (one-line revert), deletion after soak. Removing it also removes the response-byte cache, the path-keyed Redis storage, the table-update poller, and Redis itself from the deployment — the response cache is Redis's only consumer.

Observability: Sentry (cloud) for errors only — panic recovery middleware plus 5xx reports emitted from the single error-writer choke point; release tagging reuses the existing build-time version injection; events tagged with key id and route pattern; bearer material never sent. Prometheus via the standard client on an internal-only listener that the reverse proxy never routes and compose never publishes: request counter labeled route/method/status/key-id, duration histogram labeled route/method only (cardinality discipline), default Go runtime collectors. Route labels come from the stdlib mux's matched-pattern field; key-id reaches the outer metrics middleware via a context label-holder filled by the auth middleware. Middleware order: sentry → metrics → auth (with per-route scope requirements) → gzip → mux.
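
The context label-holder is the least obvious part of that paragraph, so here is a minimal sketch of it using only the standard library. The Sentry and gzip layers and the Prometheus client are left out; the record callback stands in for a counter increment, and every name (labels, metrics, auth, chain, serve) plus the key table is hypothetical.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// labels is a mutable holder the outer middleware plants in the context
// so that inner middleware can report values back outward.
type labels struct{ keyID string }

type labelsKey struct{}

// statusWriter remembers the status code written by inner handlers.
type statusWriter struct {
	http.ResponseWriter
	status int
}

func (s *statusWriter) WriteHeader(code int) {
	s.status = code
	s.ResponseWriter.WriteHeader(code)
}

// metrics is the outer layer: it plants the holder, runs the rest of the
// chain, then records whatever key id the auth layer filled in.
func metrics(record func(keyID string, status int)) func(http.Handler) http.Handler {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			l := &labels{}
			sw := &statusWriter{ResponseWriter: w, status: http.StatusOK}
			ctx := context.WithValue(r.Context(), labelsKey{}, l)
			next.ServeHTTP(sw, r.WithContext(ctx))
			record(l.keyID, sw.status)
		})
	}
}

// auth resolves the bearer token to a key id and stores only the id.
func auth(keys map[string]string) func(http.Handler) http.Handler {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			token := strings.TrimPrefix(r.Header.Get("Authorization"), "Bearer ")
			id, ok := keys[token]
			if !ok {
				http.Error(w, "Unauthorized", http.StatusUnauthorized)
				return
			}
			if l, ok := r.Context().Value(labelsKey{}).(*labels); ok {
				l.keyID = id
			}
			next.ServeHTTP(w, r)
		})
	}
}

// chain applies middleware so that the first one listed is the outermost.
func chain(h http.Handler, mws ...func(http.Handler) http.Handler) http.Handler {
	for i := len(mws) - 1; i >= 0; i-- {
		h = mws[i](h)
	}
	return h
}

// serve runs one request and returns what the metrics layer recorded.
func serve(token string) (recordedKey string, status int) {
	handler := chain(
		http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { fmt.Fprint(w, "{}") }),
		metrics(func(keyID string, code int) { recordedKey, status = keyID, code }),
		auth(map[string]string{"secret-token": "key-1"}),
	)
	req := httptest.NewRequest(http.MethodGet, "/api/v1/milpacs/ranks", nil)
	req.Header.Set("Authorization", "Bearer "+token)
	handler.ServeHTTP(httptest.NewRecorder(), req)
	return recordedKey, status
}

func main() {
	fmt.Println(serve("secret-token"))
	fmt.Println(serve("wrong"))
}
```

The holder is a pointer, so the value written by the inner auth layer is visible to the outer metrics layer after the chain returns, and only the key id ever reaches the recorder.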

Polling pressure: read endpoints send Cache-Control: max-age as the cooperative freshness signal. Rate limiting is explicitly deferred (see Out of Scope) but the per-key counters land now to inform it.

Docs: a hand-written OpenAPI 3.1 document becomes the reference spec — one document with milpacs/tickets tags, seeded by mechanically converting the two generated Swagger 2.0 files, then corrected to observed behavior (real error bodies, the plain-text 401 tiers — things the generated spec already misdescribed) with the Keycloak surface removed. The golden harness makes the spec executable: every recorded interaction is validated against it (an OpenAPI request/response filter library, test-only dependency), with two-way coverage assertions — every spec operation has at least one golden, every golden route exists in the spec. A spec that drifts from behavior fails CI with a message naming the operation and field; spec quality (rich descriptions vs. lazy blobs) remains a review concern, optionally assisted by an OpenAPI linter. The docs UI keeps serving at unchanged URLs (a single-file modern renderer is an optional cosmetic swap); the served spec templates its version from the build-time version variable, replacing the CI proto-edit step. The wire conventions (camelCase, always-emit, 64-bit-as-string, enum names) are adopted as house style for future endpoints, so the API stays uniform and the shared helper types keep new handlers convention-correct by default. Contributor docs gain an add-an-endpoint checklist: struct → handler → route registration → spec operation block → goldens, with the spec steps CI-enforced. Precondition: confirm no consumer generates client code from the served spec before retiring the 2.0 format (if any does, a frozen 2.0 alias is kept alongside). The domain glossary's "proto files are the contract" statement is updated to point at the types package, the validated spec, and the golden corpus. New ADR supersedes the split-process and plaintext-dial ADRs; the cache ADR is superseded in Phase 2; the scope-auth ADR is unchanged.
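
The two-way coverage assertion reduces to a set difference in each direction. A minimal sketch follows, assuming operations and golden routes are both normalized to "METHOD pattern" strings; coverageGaps is a hypothetical name, the {roster} and {id} placeholders are illustrative, and the OpenAPI validation library itself is not shown.

```go
package main

import (
	"fmt"
	"sort"
)

func toSet(items []string) map[string]bool {
	set := make(map[string]bool, len(items))
	for _, it := range items {
		set[it] = true
	}
	return set
}

// coverageGaps returns one message per mismatch between the operations
// declared in the spec and the routes exercised by the golden corpus.
// An empty result means coverage holds in both directions.
func coverageGaps(specOps, goldenRoutes []string) []string {
	inSpec, inGolden := toSet(specOps), toSet(goldenRoutes)
	var gaps []string
	for op := range inSpec {
		if !inGolden[op] {
			gaps = append(gaps, "spec operation has no golden: "+op)
		}
	}
	for route := range inGolden {
		if !inSpec[route] {
			gaps = append(gaps, "golden route missing from spec: "+route)
		}
	}
	sort.Strings(gaps) // stable output for CI logs
	return gaps
}

func main() {
	spec := []string{"GET /api/v1/milpacs/ranks", "GET /api/v1/tickets/{id}/messages"}
	goldens := []string{"GET /api/v1/milpacs/ranks", "GET /api/v1/roster/{roster}"}
	for _, gap := range coverageGaps(spec, goldens) {
		fmt.Println(gap)
	}
}
```

Each message names the offending operation, which is the property the CI failure needs.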

Testing Decisions

A good test here asserts externally observable behavior at the highest available seam — status code, headers a client could depend on, and semantically compared JSON — never marshaling internals, never SQL strings.

Seams, highest first:

  1. HTTP edge (existing seam — httptest-style, as the gateway auth tests already do). The golden contract corpus: the current stack is mounted in-process over a seeded fake datastore and a recorded request battery (every route × happy/edge/error/auth cases × both enum forms × repeated-filter combinations) produces canonicalized goldens. These are the red suite for the entire rewrite — the TDD outer loop. Each new handler is implemented until its goldens pass; the suite then lives on permanently as the contract regression net. Comparison is semantic (parse → canonicalize → diff), with the Keycloak strip as the single documented transform and byte-diff available as informational output. The same replay loop validates every interaction against the OpenAPI document and asserts two-way route coverage, making the spec an executable artifact rather than parallel prose.
  2. Unit seam (pure functions — TDD inner loop): enum marshalers (zero-value naming both directions), 64-bit string-integer types, binding helpers (name-or-number enums, repeated params, dual-spelling keys, lenient bools), the error writer (shape, status mapping, plain-text 401 tiers). Red-green-refactor per component.
  3. SQL seam (the one new seam, proposed at the lowest level because nothing higher can verify it): a dockerized MariaDB integration harness with schema + fixture data shaped like the forum tables. It carries (a) datastore behavior tests that replace the existing sqlmock tests — which assert the implementation's own SQL back at itself — and (b) EXPLAIN-plan assertions: red is a full-scan plan on the hot aggregation, green is the loose index scan. CI gains a MariaDB service container.
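
The EXPLAIN-plan assertion in the SQL seam can be reduced to a pure check over rows already scanned from EXPLAIN output, which keeps the decision logic testable without a database. In MariaDB, an access type of ALL means a full table scan and "Using index for group-by" in the Extra column indicates a loose index scan. The ExplainRow type and checkHotAggregation are hypothetical; the table and index names come from the DDL above, and the assumption that the plan row is reported under the table name xf_post (rather than an alias) would need checking against the real query.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// ExplainRow holds the EXPLAIN columns this check needs.
type ExplainRow struct {
	Table string
	Type  string // access type; "ALL" is a full table scan
	Key   string // index chosen by the optimizer, empty if none
	Extra string
}

func isFullScan(r ExplainRow) bool {
	return strings.EqualFold(r.Type, "ALL")
}

func isLooseIndexScan(r ExplainRow) bool {
	return strings.Contains(r.Extra, "Using index for group-by")
}

// checkHotAggregation is red on a full scan of xf_post and green only when
// the composite index is used as a loose index scan.
func checkHotAggregation(rows []ExplainRow) error {
	for _, r := range rows {
		if r.Table != "xf_post" {
			continue
		}
		if isFullScan(r) {
			return errors.New("full table scan on xf_post")
		}
		if r.Key == "user_id_post_date" && isLooseIndexScan(r) {
			return nil
		}
	}
	return errors.New("no loose index scan on xf_post found in plan")
}

func main() {
	red := []ExplainRow{{Table: "xf_post", Type: "ALL", Extra: "Using temporary; Using filesort"}}
	green := []ExplainRow{{Table: "xf_post", Type: "range", Key: "user_id_post_date", Extra: "Using index for group-by"}}
	fmt.Println(checkHotAggregation(red))
	fmt.Println(checkHotAggregation(green))
}
```

In the harness, the rows would come from running EXPLAIN on the aggregation against the dockerized MariaDB before and after the index script is applied.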

Prior art: the gateway auth middleware tests (HTTP-edge style to extend), the fake-datastore pattern from the gRPC handler tests (reused to seed goldens), the sqlmock tickets tests (pattern being retired in favor of the harness). The existing auth/scope unit tests survive with adapted types; proto-coupled tests (deprecation contract, generated-spec assertions) retire with the toolchain they guard.

Out of Scope

  • Rate limiting — deliberately deferred until the per-key Prometheus counters have accumulated evidence; lands as middleware later.
  • ETag / conditional requests — possible future answer to aggressive pollers; not now.
  • Prometheus server / Grafana provisioning — this PRD delivers the metrics endpoint only; scrape infrastructure is the operator's.
  • Tickets total_message_count removal — stays on its own pre-announced deprecation schedule.
  • Position-search fuzzy behavior — currently returns empty for at least some plausible queries; frozen as-is by the goldens. Whether that's a bug is a separate issue.
  • Any schema change beyond the four indexes — no redesign of forum-owned tables.
  • Self-hosted error monitoring — Sentry cloud free tier; the SDK speaks a portable protocol if this ever changes.
  • Docs friendliness pass — richer descriptions, examples, landing copy for the docs page. The hand-owned spec makes this an ordinary PR whenever; explicitly later.
  • Health endpoint, client SDKs, consumer notification tooling.

Further Notes

Evidence base (all measured, not estimated): 7 days of analytics — 3,660 requests, zero non-loopback gRPC, five keys ≈ 99% of traffic. Mirror DB measurements: post-date aggregation 4,226ms → 28ms with the composite index (loose index scan); AWOL 4,211ms → 198ms; single-profile record preloads 44ms → ~0ms; full-stack cache-miss A/B: single profile 4.36s → 34ms, combat roster 5.35s → 1.02s. Combat roster payload 15.7MB (1.84MB gzipped); Past Members 30.4MB against the 20MB internal ceiling. Wire-format reproduction proven by live diff of the real stack's output against plain-struct marshaling (semantic equality on every case; the only byte deltas are whitespace and map-key ordering, both invisible to parsers).

Preconditions for Phase 1: verify the production index state first (SHOW INDEX on the four tables — the mirror had drifted; production is believed unindexed). Confirm DDL privilege for the applying user.

Sizing: ~7,400 LOC deleted, ~1,600–2,000 added; roughly two focused weeks across the five phases.

Issue relationships: supersedes #95. #94 (gateway dial deprecations) becomes moot when Phase 3 lands and should be closed as superseded then. The ad-hoc analytics built for #92 have already served their purpose (answering the gRPC-traffic question); the duration-instrumented logs are the temporary measuring stick for Phases 1–2, after which Prometheus owns metrics and the log scraping retires with the old stack. The new stack's request logging is unconstrained by the old format.
