fix: quote attribute/resource_attribute values in bundled-config templates #338

Merged: sfc-gh-zeningchen merged 3 commits into main from fix/ob-60193-quote-resource-attribute-values on Jun 1, 2026

@sfc-gh-zeningchen sfc-gh-zeningchen commented May 7, 2026

Summary

The bundled-config templates that render the operator's attributes: and resource_attributes: blocks into the otelcol attributes/resource processor configs emitted unquoted YAML scalars:

- key: {{ $key }}
  value: {{ $value }}            # bare scalar — string typing lost

When otelcol then loaded the rendered config, its YAML parser interpreted any value that happened to look like a typed scalar as that type. resourceprocessor.Action.Value is interface{}, so the typed value was stamped onto every span, metric, and log going through the affected pipelines.

The fix routes each value through the existing toYaml template helper (internal/utils/templatefuncs.go), which marshals through goccy/go-yaml and returns a canonical YAML scalar — guaranteed to be a valid scalar by construction, with proper handling of values that contain quotes, newlines, or other characters that need escaping.
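
As a rough illustration of the difference between a bare and a quoted scalar, the sketch below renders the same attribute through two small text/template templates. The `quoteScalar` helper, the `render` function, and both template constants are hypothetical names for this example only; `quoteScalar` stands in for the repo's toYaml helper and uses encoding/json instead of goccy/go-yaml so that it runs with the standard library alone.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
	"text/template"
)

// quoteScalar is a hypothetical stand-in for the repo's toYaml helper.
// It uses encoding/json so the sketch needs only the standard library;
// a JSON string literal is, in general, also a valid double-quoted YAML scalar.
func quoteScalar(v string) (string, error) {
	b, err := json.Marshal(v)
	return string(b), err
}

// bareTmpl mirrors the pre-fix shape: the value is emitted unquoted.
const bareTmpl = `{{ range $key, $value := . -}}
- key: {{ $key }}
  value: {{ $value }}
{{ end }}`

// quotedTmpl routes the value through the quoting helper.
const quotedTmpl = `{{ range $key, $value := . -}}
- key: {{ $key }}
  value: {{ quoteScalar $value }}
{{ end }}`

// render executes one of the templates above against a map of attributes.
func render(src string, attrs map[string]string) (string, error) {
	funcs := template.FuncMap{"quoteScalar": quoteScalar}
	t, err := template.New("attrs").Funcs(funcs).Parse(src)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, attrs); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	attrs := map[string]string{"service.version": "060520262036"}
	for _, src := range []string{bareTmpl, quotedTmpl} {
		out, err := render(src, attrs)
		if err != nil {
			panic(err)
		}
		fmt.Print(out)
	}
}
```

The bare template writes the digits as a plain scalar, which a downstream YAML loader is free to type as a number; the quoted template writes them inside double quotes, so they can only be read back as a string.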

Concrete cases observed (OB-60193)

| Operator setting | Without fix → on signals | With fix → on signals |
| --- | --- | --- |
| service.version: "060520262036" | int64 6,530,622,494 (octal) | "060520262036" |
| service.version: "060520261956" | int64 60,520,261,956 (decimal, leading 0 stripped) | "060520261956" |
| team.name: "no" | bool false (YAML 1.1 "Norway problem") | "no" |
| service.namespace: "1.0" | float64 1 | "1.0" |
| Values with : # [ ] { } , | parse breaks / shape changes | preserved ✓ |
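
The integer values reported for the two service.version cases can be checked with a few lines of Go. This is only a sketch of the arithmetic, assuming a reader that treats a leading 0 as an octal prefix and falls back to decimal when a digit is out of range; `readDigits` is a hypothetical helper and does not reproduce the collector's actual YAML resolution code.

```go
package main

import (
	"fmt"
	"strconv"
)

// readDigits reports how a digit-only string reads as an integer when a
// leading "0" is allowed to select octal. It illustrates the arithmetic
// only; it is not the loader's actual resolution code.
func readDigits(s string) (int64, string, error) {
	// Base 0 lets the prefix pick the base, so "0..." is tried as octal.
	if n, err := strconv.ParseInt(s, 0, 64); err == nil {
		if len(s) > 1 && s[0] == '0' {
			return n, "octal", nil
		}
		return n, "decimal", nil
	}
	// An 8 or 9 makes the octal reading invalid; fall back to decimal,
	// which silently drops the leading zero.
	n, err := strconv.ParseInt(s, 10, 64)
	return n, "decimal", err
}

func main() {
	for _, s := range []string{"060520262036", "060520261956", "060520262300"} {
		n, base, err := readDigits(s)
		if err != nil {
			fmt.Printf("%q -> error: %v\n", s, err)
			continue
		}
		fmt.Printf("%q -> %d (%s)\n", s, n, base)
	}
}
```

The second input contains a 9, so it cannot be octal and is read as the decimal 60520261956, while the other two are valid octal and shrink to roughly a tenth of their apparent size.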

Affected pipelines

Anything using resource/observe_global_resource_attributes or attributes/observe_global_attributes:

  • traces / metrics / logs forward (forward.yaml.tmpl)
  • host & process metrics, host logs (Linux/Docker/Windows variants)
  • R.E.D metrics, self-monitoring, fleet heartbeat

So the same misrendered attribute is wrong on every signal, not just in one dataset.

Origin

Discovered while validating 4 Java Spring Boot messaging apps, where sample-app-validator sets a digit-only timestamp as service.version. The empty Tracing/Deployment dataset was traced upstream to this template. See OB-60193 for the full investigation, including the Python SDK reproduction that isolated the bug to the agent (the Observe backend correctly preserves strings that arrive intact on the wire).

Test plan

  • go test ./internal/... ./components/... — all packages PASS

  • Test_RenderOtelConfig — 14/14 snapshot tests PASS

  • Extended snap1-full-agent-config.yaml with regression fixtures covering each coercion class still active under YAML 1.2 (otelcol's confmap loader uses gopkg.in/yaml.v3, which dropped the YAML 1.1 yes/no/on/off boolean aliases — so the Norway-problem case from the bug report no longer manifests through this loader and was intentionally omitted as a test case):

    | Input | Without fix | With fix |
    | --- | --- | --- |
    | bool-attr: "true" | value: true (bool) | value: "true" |
    | digits-attr: "12345" | value: 12345 (int64) | value: "12345" |
    | service.namespace: "1.0" | value: 1 (float→int) | value: "1.0" |
    | service.version: "060520262300" | value: 6530622656 (octal int64) | value: "060520262300" |
  • Verified each of the four added cases is regression-catching: with the inputs in place, reverting just the template change produces a diff on every line above.

  • observe-agent config against /etc/observe-agent/observe-agent.yaml with digit-only service.version — rendered output now contains value: "060520262300" (verified manually before this PR).


@orca-security-us orca-security-us Bot left a comment


Orca Security Scan Summary

| Status | Check | Issues by priority |
| --- | --- | --- |
| Passed | Infrastructure as Code | high 0, medium 0, low 0, info 0 |
| Passed | SAST | high 0, medium 0, low 0, info 0 |
| Passed | Secrets | high 0, medium 0, low 0, info 0 |
| Passed | Vulnerabilities | high 0, medium 0, low 0, info 0 |

 {{- range $key, $value := .Attributes }}
 - key: {{ $key }}
-  value: {{ $value }}
+  value: {{ printf "%q" $value }}
Collaborator


Nice find! Can this use the toYaml template function instead of printf? I don't know if %q is guaranteed to match the yaml format.

Collaborator Author


thanks, that's a good point! I just updated it to use toYaml

@sfc-gh-zeningchen sfc-gh-zeningchen force-pushed the fix/ob-60193-quote-resource-attribute-values branch 2 times, most recently from 9d7f23f to d7c6924 Compare May 7, 2026 21:17
…lates

The templates that render the operator's `attributes:` and `resource_attributes:`
config blocks into the bundled otelcol `attributes`/`resource` processor configs
emitted unquoted YAML scalars:

  - key: {{ $key }}
    value: {{ $value }}            # bare scalar — string typing lost

When otelcol then loaded the rendered config, its YAML parser interpreted any
value that happened to look like a YAML 1.1 typed scalar (octal/decimal digits,
booleans, dates, null, hex) as that type. The resourceprocessor.Action.Value
field is `interface{}`, so the typed value was injected onto every span,
metric, and log going through the affected pipelines.

Concrete cases observed during validation (OB-60193):
  - service.version="060520262036"  -> int64 6,530,622,494 (octal)
  - service.version="060520261956"  -> int64 60,520,261,956 (decimal,
                                       leading zero dropped)
  - team.name="no"                  -> bool false  (the YAML "Norway problem")
  - service.namespace="1.0"         -> float64 1
  - any value with `:` `#` `[` `,` etc. in flow-syntax position -> parse breaks

The fix routes each value through the existing `toYaml` template helper
(internal/utils/templatefuncs.go), which marshals through goccy/go-yaml and
returns a canonical YAML scalar — guaranteed to be a valid scalar by
construction, with proper handling of values that contain quotes, newlines,
or other characters that need escaping.

Snapshot test snap1 was extended with regression fixtures covering the
distinct coercion classes that still fire under YAML 1.2 (otelcol's confmap
loader uses gopkg.in/yaml.v3, which dropped the YAML 1.1 yes/no/on/off bool
aliases — so the Norway-problem class observed in the bug report no longer
manifests through this loader and was not added as a test case):

  attributes:
    bool-attr:    "true"            -> bool true        (without fix)
    digits-attr:  "12345"           -> int64 12345      (without fix)
  resource_attributes:
    service.namespace: "1.0"        -> float64 1        (without fix)
    service.version: "060520262300" -> int64 6530622656 (octal, without fix)

The four snap1 output files (linux/macos/windows/docker) now show these
values quoted; reverting just the template change makes each one diff back
to the unquoted, type-coerced form.

Affected pipelines (anything using `resource/observe_global_resource_attributes`
or `attributes/observe_global_attributes`):
  - traces/metrics/logs forward
  - host/process metrics, host logs
  - R.E.D metrics
  - self-monitoring
  - fleet heartbeat

The bug surfaced during validation of 4 Java Spring Boot messaging apps where
`sample-app-validator` sets a digit-only timestamp service.version. With this
fix, Tracing/Deployment indexing works correctly for all-digit version strings.

Tested:
  - `go test ./internal/...` — all green
  - `Test_RenderOtelConfig` — 14 snapshot tests pass
  - Reverted just the template (kept expanded snap1 inputs) — each of the
    four added cases produces the expected coerced-value diff, confirming
    the regression coverage is real.
  - `observe-agent config` against /etc/observe-agent/observe-agent.yaml with
    digit-only service.version — rendered output now contains
    `value: "060520262300"`
@sfc-gh-zeningchen sfc-gh-zeningchen force-pushed the fix/ob-60193-quote-resource-attribute-values branch from d7c6924 to 424d183 Compare May 7, 2026 21:25
@sfc-gh-zeningchen sfc-gh-zeningchen merged commit 6d78f67 into main Jun 1, 2026
14 checks passed
@sfc-gh-zeningchen sfc-gh-zeningchen deleted the fix/ob-60193-quote-resource-attribute-values branch June 1, 2026 17:55