Skip to content

feat(otel): replace legacy trace_id interceptors with W3C OTel#49

Open
anson-lee-sl wants to merge 8 commits into
chore/deprecate-newrelicfrom
feature/grpc-w3c-otel
Open

feat(otel): replace legacy trace_id interceptors with W3C OTel#49
anson-lee-sl wants to merge 8 commits into
chore/deprecate-newrelicfrom
feature/grpc-w3c-otel

Conversation

@anson-lee-sl

@anson-lee-sl anson-lee-sl commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

Built on top of chore/deprecate-newrelic — the newrelic files are already gone, so this commit only touches the sentry path and Kitex.

gRPC server

  • Delete plugins/grpc/interceptors/trace_id.go (replaced by otelgrpc.NewServerHandler() StatsHandler, already wired in the grpc_with_sentry preset)
  • Delete plugins/grpc/interceptors/opentelemetry.go (custom interceptor duplicating the StatsHandler)
  • Migrate consumers (recovery, request_log) to read OTel SpanContextFromContext instead of ctx.Value("trace_id")
  • Drop trace_id and otlp from the grpc_with_sentry preset chain

Kitex server

  • Add github.com/kitex-contrib/obs-opentelemetry v0.3.0
  • Add suites field + SetSuites method to KitexServer; install tracing.NewServerSuite() by default in NewKitexServer
  • Delete plugins/kitex/middlewares/trace_id.go (replaced by the suite)
  • Migrate KitexRequestLogMiddleware to read OTel SpanContext
  • Drop traceIDMiddleware from NewKitexServer's default chain

Breaking change

trace_id format changes from UUID to 32-char lowercase hex (W3C traceparent). Clients must read traceparent from gRPC response trailers / Kitex response metainfo instead of the x-trace-id header.

Test Cases

Deployed gRPC client, server, kitex client, server to a local k8s cluster with Grafana, Tempo, otel collector deployed.

image image image image

trace_id in ctx
│ 2026/06/16 06:06:02 [grpc-server] SayHello: name="world" traceID=f3da771be1b5685f8d8376b2e090e8d7 spanID=3ccea26e350ed2a7 ctx_traceID= │

anson-lee-sl and others added 2 commits June 9, 2026 11:39
The previous commit (535c442) deleted grpc_with_error_reporting.go
alongside the newrelic deprecation. The file's build tag required
newrelic (`grpc && newrelic && sentry && otel`), so its removal was
technically defensible — but the *name* carries architectural meaning
beyond the newrelic dependency: it is the role-named fan-out extension
point for error reporters, distinct from the vendor-named
grpc_with_sentry.go.

Restore it as a Sentry-only preset (build tag `grpc && sentry && otel`,
no newrelic) so that:

- The vendor-named preset (DefaultGrpcServerWithSentry) stays as-is.
- The role-named preset (DefaultGrpcServerWithErrorReporting) exists
  as the designated extension point for adding additional error
  reporters (Datadog, Rollbar, Bugsnag, etc.) to the interceptor chain
  in the future, without forcing every caller to migrate off the
  vendor-named preset.

Today the two presets have identical interceptor chains; the
distinction is the file's role as a fan-out point. This matches
the pre-deletion design intent of the file (commit 090654f: 'double
write to both newrelic and sentry'), generalized beyond the
newrelic-specific double-write.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Built on top of chore/deprecate-newrelic — the newrelic files are
already gone, so this commit only touches the sentry path and Kitex.

gRPC server:
- Delete plugins/grpc/interceptors/trace_id.go (replaced by
  otelgrpc.NewServerHandler() StatsHandler, already wired in the
  grpc_with_sentry preset)
- Delete plugins/grpc/interceptors/opentelemetry.go (custom interceptor
  duplicating the StatsHandler)
- Migrate consumers (recovery, request_log) to read OTel
  SpanContextFromContext instead of ctx.Value("trace_id")
- Drop trace_id and otlp from the grpc_with_sentry preset chain

Kitex server:
- Add github.com/kitex-contrib/obs-opentelemetry v0.3.0
- Add suites field + SetSuites method to KitexServer; install
  tracing.NewServerSuite() by default in NewKitexServer
- Delete plugins/kitex/middlewares/trace_id.go (replaced by the suite)
- Migrate KitexRequestLogMiddleware to read OTel SpanContext
- Drop traceIDMiddleware from NewKitexServer's default chain

Trace_id format changes from UUID to 32-char lowercase hex (W3C
traceparent). Clients must read traceparent from gRPC response
trailers / Kitex response metainfo instead of x-trace-id header.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@anson-lee-sl anson-lee-sl force-pushed the chore/deprecate-newrelic branch from 47b8f93 to 6f3e7be Compare June 9, 2026 06:24
@anson-lee-sl anson-lee-sl marked this pull request as ready for review June 11, 2026 04:57
When neither the legacy ctx["trace_id"] key nor an OTel SpanContext is set, GetTraceID used to return "", breaking callers like plugins/pulsar/producer.go and downstream TraceIDClientInterceptor that rely on it for log/Sentry/message correlation.

Fall back to a freshly generated 16-byte OTel TraceID instead, using the same pattern as the OTel SDK's own randomIDGenerator (math/rand seeded once from crypto/rand, mutex-guarded for goroutine safety). The returned value is a valid 32-char lowercase hex string, consistent with the W3C traceparent format produced elsewhere in the OTel pipeline.
Switch from a self-managed *math/rand.Rand + sync.Mutex + crypto/rand seed to math/rand/v2's package-level functions, matching the OTel SDK's current pattern. The v2 functions are goroutine-safe so the mutex and crypto/rand seed are no longer needed.

Also adopt the SDK's binary.NativeEndian.PutUint64 split (8 bytes per Uint64) and the IsValid() retry loop, which guarantees a non-zero trace ID.
@ThomasKwan-shopline ThomasKwan-shopline self-requested a review June 16, 2026 04:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants