Skip to content

chore: Refactor OTel concerns in preparation for metric provider#123

Merged
morgan-wowk merged 1 commit intomasterfrom
organize-opentelemetry
Mar 5, 2026
Merged

chore: Refactor OTel concerns in preparation for metric provider#123
morgan-wowk merged 1 commit intomasterfrom
organize-opentelemetry

Conversation

@morgan-wowk
Copy link
Collaborator

@morgan-wowk morgan-wowk commented Feb 25, 2026

Refactor OpenTelemetry instrumentation into modular package

Changes:

  • Restructured OpenTelemetry code into a dedicated package with clear public API
  • Split provider setup and FastAPI instrumentation into separate functions (setup_providers() and instrument_fastapi())
  • Added configuration validation and error handling for OpenTelemetry settings
  • Implemented provider state checking to conditionally enable instrumentation only when providers are configured
  • Created internal modules for shared configuration resolution and provider state management
  • Added comprehensive test coverage for all new modules and functions
  • Updated main application files to use the new simplified API calls

Copy link
Collaborator Author

morgan-wowk commented Feb 25, 2026

@morgan-wowk morgan-wowk self-assigned this Feb 25, 2026
@morgan-wowk morgan-wowk force-pushed the organize-opentelemetry branch 2 times, most recently from 8a5aa83 to 7672232 Compare February 25, 2026 10:13
@morgan-wowk morgan-wowk changed the title chore: Organize opentelemetry concerns in preparation for metric provider chore: Organize OTel concerns in preparation for metric provider Feb 25, 2026
@morgan-wowk morgan-wowk changed the title chore: Organize OTel concerns in preparation for metric provider chore: Refactor OTel concerns in preparation for metric provider Feb 25, 2026
@morgan-wowk morgan-wowk force-pushed the organize-opentelemetry branch from 7672232 to 0d953d1 Compare February 25, 2026 13:01
@morgan-wowk morgan-wowk marked this pull request as ready for review February 25, 2026 19:53
@morgan-wowk morgan-wowk requested a review from Ark-kun as a code owner February 25, 2026 19:53
Copy link
Collaborator

@yuechao-qin yuechao-qin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only

@morgan-wowk morgan-wowk force-pushed the organize-opentelemetry branch from 0d953d1 to f2c09c1 Compare February 26, 2026 00:56
Copy link
Collaborator Author

@yuechao-qin Feedback addressed.

# region: OpenTelemetry initialization
from cloud_pipelines_backend.instrumentation import opentelemetry as otel

otel.setup_providers()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: for local setup, should otel service be run separately? What would happen if no OTEL endpoint is configured (e.g. in case of e2e tests on CI)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great question.

I'll answer in two ways.

For local development, we have this example stack I added and have personally been using to test traces and metrics for OSS: https://github.com/TangleML/tangle/tree/master/examples/observability/otel-jaeger-prometheus

For writing tests, or adding new metrics, here are some important things to know:

  • The OTel Python SDK that we consume, and even our FastAPI auto-instrumentation consumes, is intentionally designed so that every action is a NOOP if no providers are set.
    • It means, no exceptions from emitting metrics or opening traces when OTel is not configured.
  • We only set providers on our end if they have filled out correctly shaped values for the environment variables.
  • We only auto-instrument FastAPI if providers are set.
  • For verfiying a metric, or trace, was used, via testing (whether its unit or e2e) that is possible and I'll follow up with instructions and examples in a separate PR. Consider this is a TODO. We will have to manipulate the type of reader / exporter we use at the start of tests.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is an issue: #132

@Ark-kun
Copy link
Contributor

Ark-kun commented Mar 3, 2026

Updated main application files to use the new simplified API calls

Hmm. Are you sure we're simplifying things? It's hard to get more simple than the current 1-liner:

otel_tracing.setup_api_tracing(app)

Copy link
Collaborator Author

morgan-wowk commented Mar 3, 2026

Updated main application files to use the new simplified API calls

Hmm. Are you sure we're simplifying things? It's hard to get more simple than the current 1-liner:

otel_tracing.setup_api_tracing(app)

​It's more about decoupling. The reason it's better this way is because we will have scenarios where we want providers but don't want auto-instrumentation for FastAPI.

I can add a function setup_providers_and_fastapi but it feels like adding a function just to have 1 line instead of 2, you know?

Copy link
Collaborator Author

If you would like such a function, I'll add it!

Copy link
Contributor

Ark-kun commented Mar 4, 2026

Let's keep the tests inside /tests/. We can have subfolders there like instrumentation or telemetry

def test_uses_custom_service_name(self, monkeypatch):
monkeypatch.setenv("TANGLE_OTEL_EXPORTER_ENDPOINT", "http://localhost:4317")

result = config.resolve(service_name="oasis-api")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to switch naming from Oasis to Tangle

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed

Copy link
Contributor

@Ark-kun Ark-kun Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming module with a singular noun often conflicts with local variables. Better use plural or more uncountable. (Maybe, configuration is OK despite technically being singular noun...)

Kubernetes has

from kubernetes import config
from kubernetes import client

and it's very inconvenient, forcing me to rename those imports every time.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair! Changed.

Copy link
Collaborator Author

morgan-wowk commented Mar 4, 2026

Let's keep the tests inside /tests/. We can have subfolders there like instrumentation or telemetry

​Sure thing.

Increasingly over time, our root /tests directory will become difficult to search through. I find that it is already becoming that way.

We can stick with the root-level tests for this PR. Here is the comparison of the two options for future.

image.png

Copy link
Collaborator Author

☝️ We can talk about this more in the future. There might be some "Pythonic" preference I am not aware of here so I have an open mind on the current root level tests strategy.

I've done both root-level testing and co-located testing on small and large projects. Co-location for its many benefits has outweighed any cons for my previous teams.

As our backend grows, we may experience some of the pains of a root level tests directory. If we do, let's consider giving co-location a try. I know we plan to add many tests in the near future and it could come up sooner than later as a desire.

CC: @yuechao-qin

@morgan-wowk morgan-wowk force-pushed the organize-opentelemetry branch from f2c09c1 to 767ac8c Compare March 5, 2026 00:24
Copy link
Collaborator Author

Let's keep the tests inside /tests/. We can have subfolders there like instrumentation or telemetry

Moved

…ider

**Changes:**

* Better organize opentelemetry code to create a better API for callers (various entrypoints) to:
    * Setup providers (tracing, metrics, logging eventually)
    * Setup FastAPI auto instrumentation

Made-with: Cursor
@morgan-wowk morgan-wowk force-pushed the organize-opentelemetry branch from 767ac8c to 7251a89 Compare March 5, 2026 00:40
Copy link
Collaborator

I do prefer co-located project structure myself.

However, if my memory serves me correctly, Python doesn't support colocated tests (like Angular's .spec.ts) because there's no build step to exclude them — test files end up getting packaged and imported accidentally, pytest is designed to auto-discover from a separate tests/ dir out of the box.

@Ark-kun
Copy link
Contributor

Ark-kun commented Mar 5, 2026

Co-location tests has advantages, but also have several cons.
Tests get included in the installed package, problems with test data, imports etc.

Usually there is 1-1 mapping between modules and test modules:

my_modeule.py
my_modeule_tests.py

But when tests start to get their own conftest.py files, test data, utility modules, etc, I think it's definitely time for them to live in their own place.


[tool.pytest.ini_options]
testpaths = ["tests"]
addopts = "--import-mode=importlib"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which errors required making this change?
Is this still needed?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will run the tests again tomorrow and confirm. And paste the error if it is still present.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The answer is yes, it is still needed.

Context:
--import-mode=importlib is required because the test layout has two files with the same name: test_providers.py exists in both tests/instrumentation/opentelemetry/ and tests/instrumentation/opentelemetry/_internal/. Pytest's default prepend import mode can't distinguish between identically-named test modules in sibling directories — it errors with "which is not the same as the test file we want to collect". importlib mode gives each file a unique identity based on its full path rather than its module name, which resolves the collision.

Copy link
Contributor

@Ark-kun Ark-kun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, Morgan. Approved with comment (about pyproject.toml change)

Copy link
Collaborator Author

Good to know about this behaviour in Python. Thanks guys

Copy link
Collaborator Author

morgan-wowk commented Mar 5, 2026

Merge activity

  • Mar 5, 9:25 PM UTC: A user started a stack merge that includes this pull request via Graphite.
  • Mar 5, 9:25 PM UTC: @morgan-wowk merged this pull request with Graphite.

@morgan-wowk morgan-wowk merged commit 63c2d81 into master Mar 5, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants