feat(sqs): add OpenTelemetry instrumentation for SQS client and worker#48
anson-lee-sl wants to merge 6 commits into
0ad0f98 to 36f8a76
Following OpenTelemetry Semantic Conventions 1.41.0:
- Semantic conventions for AWS SQS
- Semantic conventions for messaging spans
36f8a76 to f35b9f8
You should update the Makefile to test sqs.
}

func startSpan(r *request.Request) {
	class, ok := classifyOperation(r.Operation.Name)
nit: the class variable might be better named operation.
if queueName != "" {
	attrs = append(attrs, attribute.String("messaging.destination.name", queueName))
}
if addr := ServerAddressFromURL(queueURL); addr != "" {
Maybe merge ServerAddressFromURL and ServerPortFromURL into one function, since their implementations are similar.
func injectTraceContext(r *request.Request) {
	propagator := otel.GetTextMapPropagator()
	switch p := r.Params.(type) {
	case *aws_sqs.SendMessageInput:
Why does only send need to inject trace context? Is it to update the SQS message with the trace ID?
Yes. When receiving, we use addReceiveLinks to extract the trace ID from the message. So:
- Before sending, we inject a trace ID.
- Before processing the message in the consumer, we extract the trace ID from the message.
Before processing the message, we extract the message
Is something missing here?
We don't need to inject a trace id when consuming a message.
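The asymmetry discussed in this thread (inject only on send, extract only on the consumer side) can be illustrated with a small dependency-free sketch. Everything here is an assumption made for illustration: the carrier interface is declared locally and merely mirrors the Get/Set/Keys shape of OpenTelemetry's TextMapCarrier, attrCarrier is a hypothetical stand-in for the PR's SQSMessageCarrier (real SQS message attributes are typed values, not plain strings), and injectOnSend / extractOnProcess are made-up helper names rather than functions from this PR.

```go
package main

import "fmt"

// carrier mirrors the Get/Set/Keys shape of OpenTelemetry's TextMapCarrier.
// It is declared locally so this sketch compiles without external modules.
type carrier interface {
	Get(key string) string
	Set(key, value string)
	Keys() []string
}

// attrCarrier is a hypothetical stand-in for SQSMessageCarrier.
// Assumption: message attributes are simplified to a string map.
type attrCarrier map[string]string

func (c attrCarrier) Get(key string) string { return c[key] }
func (c attrCarrier) Set(key, value string) { c[key] = value }
func (c attrCarrier) Keys() []string {
	keys := make([]string, 0, len(c))
	for k := range c {
		keys = append(keys, k)
	}
	return keys
}

// injectOnSend is the producer side: it writes the current trace context
// into the outgoing message's attributes before the message is sent.
func injectOnSend(c carrier, traceparent string) {
	c.Set("traceparent", traceparent)
}

// extractOnProcess is the consumer side: it only reads what the producer
// wrote, so nothing needs to be injected when consuming.
func extractOnProcess(c carrier) (string, bool) {
	tp := c.Get("traceparent")
	return tp, tp != ""
}

func main() {
	attrs := attrCarrier{}
	// Example traceparent value in W3C Trace Context format.
	injectOnSend(attrs, "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
	if tp, ok := extractOnProcess(attrs); ok {
		fmt.Println("consumer continues trace:", tp)
	}
}
```

In the real code the propagator returned by otel.GetTextMapPropagator() would perform the inject and extract steps; the sketch only shows which side writes and which side reads.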
The order swap of 'sentry,otel' <-> 'otel,sentry' in the legacy // +build line was a drive-by in 1ab9cbf (ci: test sqs, sqs_worker) with no functional effect: commas are AND-joined and order is commutative, and the modern //go:build line was untouched. Restore the pre-1ab9cbf order so the file matches master.
Both helpers shared the same url.Parse + error-handling skeleton, and the two span-start sites in startSpan (otel_client.go) and startProcessSpan (otel_consumer.go) called them sequentially on the same queueURL. Replace them with HostPortFromURL(queueURL) (host, port int); call sites destructure and only record server.port when server.address is recorded. Test gains a port-bearing URL case so the port branch is exercised.
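For reviewers, a minimal sketch of what a merged helper along these lines might look like. This is not the code from the commit: the return convention (empty host and port 0 when nothing usable is found) and the example queue URLs are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// HostPortFromURL parses rawURL once and returns its hostname and its
// explicit port. Assumption: an empty host and a zero port mean
// "not available", so callers can skip the corresponding attributes.
func HostPortFromURL(rawURL string) (host string, port int) {
	u, err := url.Parse(rawURL)
	if err != nil || u.Hostname() == "" {
		return "", 0
	}
	host = u.Hostname()
	// Port() returns "" when the URL carries no explicit port.
	if p := u.Port(); p != "" {
		if n, err := strconv.Atoi(p); err == nil {
			port = n
		}
	}
	return host, port
}

func main() {
	// Hypothetical queue URLs: one without a port, one with a port.
	urls := []string{
		"https://sqs.us-east-1.amazonaws.com/123456789012/orders",
		"http://localhost:4566/000000000000/orders",
	}
	for _, q := range urls {
		host, port := HostPortFromURL(q)
		fmt.Printf("%s -> host=%q port=%d\n", q, host, port)
	}
}
```

With this shape, a call site would record server.address only when host is non-empty, and server.port only inside that branch and when port is non-zero.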
// +build uses comma-AND with right-associative nesting, so swapping the order of otel and sentry in the legacy line changes the AST even though the predicate is equivalent. go vet (run by go test) compares the two constraint ASTs and rejects the mismatch. 1ab9cbf aligned them when it added otel to the test tags; restoring the alignment.

Summary
Add OpenTelemetry instrumentation for the AWS SQS client and worker, following OTel Semantic Conventions 1.41.0 (messaging spans and AWS SQS).

Instrumented operations
SQS Client (plugins/sqs/otel_client.go)

| Operation | messaging.operation.type | messaging.operation.name |
| --- | --- | --- |
| SendMessage | send | send |
| SendMessageBatch | send | send_batch |
| ReceiveMessage | receive | receive |
| DeleteMessage | settle | delete |
| DeleteMessageBatch | settle | delete_batch |
| ChangeMessageVisibility | settle | change_visibility |
| ChangeMessageVisibilityBatch | settle | change_visibility_batch |

SQS Worker (plugins/sqs_worker/otel_consumer.go)
- SpanKindConsumer process span per message
- traceparent extracted from MessageAttributes
- DeleteMessage captured as child settle span

Trace context propagation
W3C traceparent, tracestate, and baggage are injected into the MessageAttributes of outgoing messages for SendMessage / SendMessageBatch, and extracted on the consumer side. addReceiveLinks attaches upstream trace contexts as span links on ReceiveMessage spans.

Files changed
plugins/sqs/
otel_carrier.go — SQSMessageCarrier adapts SQS attrs to TextMapCarrier
otel_client.go — SQS client handler hooks (Validate + Complete)
otel_client_noop.go — no-op when otel build tag is absent
otel_client_test.go — unit tests for client instrumentation
topic.go — auto-instrument SQS client in AddTopic
README.md — documentation
plugins/sqs_worker/
otel_consumer.go — process hook: extract context + start process span
otel_consumer_test.go — unit tests for consumer instrumentation
worker.go — processHook extension point (no-op default)
README.md — documentation
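To make the operation mapping in otel_client.go concrete, here is a rough sketch of how classifyOperation could be structured, using the values from the Instrumented operations table. Only the function name and its (value, ok) return shape come from the reviewed diff; the operationClass type, its field names, and the map-based lookup are assumptions, not the PR's actual implementation.

```go
package main

import "fmt"

// operationClass is a hypothetical type holding the two messaging
// attributes recorded for each instrumented SQS operation.
type operationClass struct {
	Type string // value for messaging.operation.type
	Name string // value for messaging.operation.name
}

// operationClasses mirrors the Instrumented operations table.
var operationClasses = map[string]operationClass{
	"SendMessage":                  {Type: "send", Name: "send"},
	"SendMessageBatch":             {Type: "send", Name: "send_batch"},
	"ReceiveMessage":               {Type: "receive", Name: "receive"},
	"DeleteMessage":                {Type: "settle", Name: "delete"},
	"DeleteMessageBatch":           {Type: "settle", Name: "delete_batch"},
	"ChangeMessageVisibility":      {Type: "settle", Name: "change_visibility"},
	"ChangeMessageVisibilityBatch": {Type: "settle", Name: "change_visibility_batch"},
}

// classifyOperation looks up an SQS API operation name and reports
// whether it appears in the table above.
func classifyOperation(apiName string) (operationClass, bool) {
	c, ok := operationClasses[apiName]
	return c, ok
}

func main() {
	for _, name := range []string{"SendMessageBatch", "DeleteMessage", "ListQueues"} {
		if c, ok := classifyOperation(name); ok {
			fmt.Printf("%s: type=%s name=%s\n", name, c.Type, c.Name)
		} else {
			fmt.Printf("%s: not in the table\n", name)
		}
	}
}
```

A table-driven lookup keeps the mapping in one place, so adding an operation means adding one map entry rather than another switch case.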
Build tags
-tags "sqs,otel" # SQS client instrumentation
-tags "sqs,sqs_worker,otel" # SQS client + worker instrumentation
Test Cases
Deployed fake-sqs-producer and fake-sqs-consumer to a local OrbStack cluster. All 8 span types confirmed emitting in Tempo.
- send: 89550dc2f00b34dc39ae91b771145bd9
- send_batch: 8f2a9c5b99ef38cedb8bdb4b131c26e1
- receive: 5039a668ce0bae15fc69005f13150b
- delete + process: c45c9188dd5aca12b9690b87d5f4656c
- delete_batch: 703bd4d86dea75462db2a82bdef5dd19 (producer) / 8b4d1eb7854d241e2ee538cb77b3f55 (consumer)
- change_visibility: 48ddbf96cb267d73ed149b75d8c4388c (producer) / c1238fc61cb0b195a936ddafbb71991b (consumer)
- change_visibility_batch: 60e2107e99c510b418e5021c11869353 (producer) / c2ff3e2e9ee19ab5d513fe49d8b57932 (consumer)