From 51ebc3c5b0117e4cc127a376b3659f515de120f2 Mon Sep 17 00:00:00 2001 From: Max Novak Date: Tue, 30 Jun 2026 22:45:00 +0100 Subject: [PATCH] feat: add OpenTelemetry metrics (requests, duration, errors) Publishes three instruments from a new GoogleMapsMetrics meter ("GoogleMapsApi", same name as the trace source): gmaps.client.requests (counter), gmaps.client.request.duration (histogram, seconds), and gmaps.client.request.errors (counter). Recorded once per call in the engine and zero-cost until a listener subscribes, mirroring the existing tracing; no new package dependency. The ILogger half of the issue is intentionally left out and documented as a non-goal. Closes #296 --- .agents/known-issues.md | 1 - .agents/observability.md | 39 ++++- GoogleMapsApi.Test/DiagnosticsMetricsTests.cs | 149 ++++++++++++++++++ .../Diagnostics/GoogleMapsMetrics.cs | 87 ++++++++++ GoogleMapsApi/Engine/MapsAPIGenericEngine.cs | 31 ++-- GoogleMapsApi/PublicAPI.Unshipped.txt | 2 + README.md | 33 +++- 7 files changed, 325 insertions(+), 17 deletions(-) create mode 100644 GoogleMapsApi.Test/DiagnosticsMetricsTests.cs create mode 100644 GoogleMapsApi/Diagnostics/GoogleMapsMetrics.cs diff --git a/.agents/known-issues.md b/.agents/known-issues.md index dc18ebb..97d9609 100644 --- a/.agents/known-issues.md +++ b/.agents/known-issues.md @@ -11,7 +11,6 @@ When you fix one, remove it from this list (and update the relevant `.agents/*.m | --- | --- | --- | --- | --- | | **B3** | medium | `DurationJsonConverter` and `OverviewPolylineJsonConverter` resolve members by name via reflection (`GetProperty`) — rename-fragile, and they silently accept partial objects (a missing token leaves that field unset). | `Engine/JsonConverters/DurationJsonConverter.cs`, `Engine/JsonConverters/OverviewPolylineJsonConverter.cs` | Replace reflection with direct (de)serialization of explicit DTOs. | | **B4** | low | HTTP status tag is set on `Activity.Current` instead of the captured `activity`, so it can misattribute under unusual ambient-Activity nesting. | `Engine/MapsAPIGenericEngine.cs:122` | Capture `activity` and tag it directly; pass it down or set the tag inside the parent scope. See [`observability.md`](observability.md). | -| **B5** | low | Observability is tracing-only — no `ILogger` integration and no metrics (counters/histograms). | `Engine/MapsAPIGenericEngine.cs` | Optional: add an `ILogger`-based debug log and OTel metrics (request count, duration histogram, error count). | | **B7** | low | Inconsistent SSL enforcement (`TimeZoneRequest` forces HTTPS, others don't) and mixed enum handling (some `[EnumMember]`, some bare names). | `Entities/TimeZone/Request/TimeZoneRequest.cs`; various `Entities/**/Response/Status*.cs` | Decide one SSL policy; standardize on `[EnumMember]` (or document when it's intentionally omitted). | | **B8** | medium | No manual approval gate before NuGet publish (a `v*` tag push publishes immediately), and Renovate auto-merge assumes branch protection is configured. | `.github/workflows/nuget.yml`, `renovate.json` | Confirm `master` branch protection requires CI; optionally add a release approval environment. (Process decision for the maintainer.) See [`build-release-ci.md`](build-release-ci.md). | diff --git a/.agents/observability.md b/.agents/observability.md index 01a8710..e4997e6 100644 --- a/.agents/observability.md +++ b/.agents/observability.md @@ -1,7 +1,8 @@ # Observability, events & error handling -Tracing, the two diagnostic events, and how failures surface. All of this lives in -`GoogleMapsApi/Engine/MapsAPIGenericEngine.cs` and `GoogleMapsApi/Diagnostics/GoogleMapsActivity.cs`. +Tracing, metrics, the two diagnostic events, and how failures surface. All of this lives in +`GoogleMapsApi/Engine/MapsAPIGenericEngine.cs`, `GoogleMapsApi/Diagnostics/GoogleMapsActivity.cs`, and +`GoogleMapsApi/Diagnostics/GoogleMapsMetrics.cs`. ## OpenTelemetry tracing @@ -39,9 +40,37 @@ On exception the span is set to `ActivityStatusCode.Error` with the message and > `GetHttpResponseAsync` rather than the captured `activity` reference (`MapsAPIGenericEngine.cs:122`). > Fine in the common path; can misattribute under unusual ambient-Activity nesting. See > [`known-issues.md`](known-issues.md). -> -> **Gaps (B5):** tracing only — there is no `ILogger` integration and no metrics (counters/histograms). -> Latency is implicit in span duration. Logging is a candidate enhancement. + +## Metrics + +The library also records **three metric instruments per call**, published from a `Meter` named +**`GoogleMapsMetrics.MeterName == "GoogleMapsApi"`** (same name as the trace source). Like the tracing, the +instruments are zero-cost until a listener subscribes — `GoogleMapsMetrics.Record` early-returns when no +instrument is `Enabled`. Recording happens once per call in the `finally` of `QueryGoogleAPIAsync`. + +Consumer wiring (OpenTelemetry): + +```csharp +meterProviderBuilder.AddMeter(GoogleMapsApi.Diagnostics.GoogleMapsMetrics.MeterName); +``` + +| Instrument | Type | Unit | Tags | +| --- | --- | --- | --- | +| `gmaps.client.requests` | `Counter` | `{request}` | `gmaps.api`, `http.request.method` | +| `gmaps.client.request.duration` | `Histogram` | `s` | `gmaps.api`, `http.request.method`, `http.response.status_code`*, `gmaps.response_status`* | +| `gmaps.client.request.errors` | `Counter` | `{error}` | `gmaps.api`, `error.type` | + +\* added only when available (a response arrived / the response type exposes a `Status`). + +The request counter increments for **every** attempt (success or failure); the error counter increments only when +the call throws. Mirroring the trace model, Google business statuses (`ZERO_RESULTS`, `OVER_QUERY_LIMIT`, …) arrive +as `200 OK`, do **not** throw, and so are **not** counted as errors — they are visible via `gmaps.response_status` +on the duration histogram. + +> **Non-goal — `ILogger`:** there is deliberately no `ILogger` integration. The OTel tracing already carries +> method/host/redacted URL/status/`error.type`/duration; the library has no retry logic to "warn" on; and the +> static HTTP engine makes threading an instance logger costly. Users who want logs should bridge the spans to a +> log sink rather than expect a built-in logger. ## The two events (`IEngineFacade`) diff --git a/GoogleMapsApi.Test/DiagnosticsMetricsTests.cs b/GoogleMapsApi.Test/DiagnosticsMetricsTests.cs new file mode 100644 index 0000000..74b4472 --- /dev/null +++ b/GoogleMapsApi.Test/DiagnosticsMetricsTests.cs @@ -0,0 +1,149 @@ +using System; +using System.Collections.Generic; +using System.Diagnostics.Metrics; +using System.Linq; +using System.Net; +using System.Net.Http; +using System.Threading; +using System.Threading.Tasks; +using GoogleMapsApi.Diagnostics; +using GoogleMapsApi.Entities.Geocoding.Request; +using NUnit.Framework; + +namespace GoogleMapsApi.Test +{ + /// + /// Hermetic tests asserting that each API call records to the metric instruments published from the + /// meter. No live network calls — every HTTP exchange is + /// intercepted by . + /// + [TestFixture] + public class DiagnosticsMetricsTests + { + private const string OkGeocodingJson = "{\"status\":\"OK\",\"results\":[]}"; + + [Test] + public async Task QueryAsync_Success_RecordsRequestAndDuration_NoErrors() + { + using var collector = new MetricsCollector(); + + using var handler = new StubHandler(HttpStatusCode.OK, OkGeocodingJson); + using var http = new HttpClient(handler); + var client = new GoogleMapsClient(http, new GoogleMapsClientOptions { ApiKey = "k" }); + + await client.Geocode.QueryAsync(new GeocodingRequest { Address = "1600 Amphitheatre Pkwy" }); + + var requests = collector.Single("gmaps.client.requests"); + Assert.That(requests.Value, Is.EqualTo(1L)); + Assert.That(requests.Tags["gmaps.api"], Is.EqualTo("Geocoding")); + Assert.That(requests.Tags["http.request.method"], Is.EqualTo("GET")); + Assert.That(requests.Tags.ContainsKey("http.response.status_code"), Is.False); + + var duration = collector.Single("gmaps.client.request.duration"); + Assert.That(duration.Value, Is.GreaterThanOrEqualTo(0.0)); + Assert.That(duration.Tags["gmaps.api"], Is.EqualTo("Geocoding")); + Assert.That(duration.Tags["http.response.status_code"], Is.EqualTo(200)); + Assert.That(duration.Tags["gmaps.response_status"], Is.EqualTo("OK")); + + Assert.That(collector.Count("gmaps.client.request.errors"), Is.EqualTo(0)); + } + + [Test] + public void QueryAsync_HttpError_RecordsErrorAndDuration() + { + using var collector = new MetricsCollector(); + + using var handler = new StubHandler(HttpStatusCode.Forbidden, "denied"); + using var http = new HttpClient(handler); + var client = new GoogleMapsClient(http, new GoogleMapsClientOptions { ApiKey = "k" }); + + Assert.ThrowsAsync( + (Func)(() => client.Geocode.QueryAsync(new GeocodingRequest { Address = "a" }))); + + Assert.That(collector.Single("gmaps.client.requests").Value, Is.EqualTo(1L)); + + var error = collector.Single("gmaps.client.request.errors"); + Assert.That(error.Value, Is.EqualTo(1L)); + Assert.That(error.Tags["gmaps.api"], Is.EqualTo("Geocoding")); + Assert.That(error.Tags["error.type"], Is.Not.Null); + + var duration = collector.Single("gmaps.client.request.duration"); + Assert.That(duration.Tags["http.response.status_code"], Is.EqualTo(403)); + } + + [Test] + public async Task QueryAsync_WithoutListener_DoesNotThrow() + { + using var handler = new StubHandler(HttpStatusCode.OK, OkGeocodingJson); + using var http = new HttpClient(handler); + var client = new GoogleMapsClient(http, new GoogleMapsClientOptions { ApiKey = "k" }); + + var response = await client.Geocode.QueryAsync(new GeocodingRequest { Address = "a" }); + + Assert.That(response, Is.Not.Null); + } + + /// Captures measurements from the GoogleMapsApi meter for the duration of a test. + private sealed class MetricsCollector : IDisposable + { + private readonly MeterListener _listener; + private readonly List<(string Instrument, double Value, Dictionary Tags)> _records = new(); + + public MetricsCollector() + { + _listener = new MeterListener + { + InstrumentPublished = (instrument, listener) => + { + if (instrument.Meter.Name == GoogleMapsMetrics.MeterName) + listener.EnableMeasurementEvents(instrument); + } + }; + _listener.SetMeasurementEventCallback( + (inst, value, tags, _) => Add(inst.Name, value, tags)); + _listener.SetMeasurementEventCallback( + (inst, value, tags, _) => Add(inst.Name, value, tags)); + _listener.Start(); + } + + private void Add(string instrument, double value, ReadOnlySpan> tags) + { + var dict = new Dictionary(); + foreach (var tag in tags) + dict[tag.Key] = tag.Value; + _records.Add((instrument, value, dict)); + } + + public int Count(string instrument) => _records.Count(r => r.Instrument == instrument); + + public (double Value, Dictionary Tags) Single(string instrument) + { + var record = _records.Single(r => r.Instrument == instrument); + return (record.Value, record.Tags); + } + + public void Dispose() => _listener.Dispose(); + } + + private sealed class StubHandler : HttpMessageHandler + { + private readonly HttpStatusCode _statusCode; + private readonly string _responseBody; + + public StubHandler(HttpStatusCode statusCode, string responseBody) + { + _statusCode = statusCode; + _responseBody = responseBody; + } + + protected override Task SendAsync(HttpRequestMessage request, CancellationToken cancellationToken) + { + cancellationToken.ThrowIfCancellationRequested(); + return Task.FromResult(new HttpResponseMessage(_statusCode) + { + Content = new StringContent(_responseBody) + }); + } + } + } +} diff --git a/GoogleMapsApi/Diagnostics/GoogleMapsMetrics.cs b/GoogleMapsApi/Diagnostics/GoogleMapsMetrics.cs new file mode 100644 index 0000000..85ebc87 --- /dev/null +++ b/GoogleMapsApi/Diagnostics/GoogleMapsMetrics.cs @@ -0,0 +1,87 @@ +using System.Collections.Generic; +using System.Diagnostics; +using System.Diagnostics.Metrics; + +namespace GoogleMapsApi.Diagnostics +{ + /// + /// Metrics entry point for the library. Every Google Maps API call records to a small set of + /// instruments published from the named + /// . Subscribe to it from your metrics pipeline — for OpenTelemetry, + /// call AddMeter(GoogleMapsMetrics.MeterName) on your MeterProviderBuilder. + /// Instrumentation is inert (no measurements taken) until a listener subscribes, so it is always on. + /// + /// + /// Three instruments are published, all tagged with gmaps.api (the service name, e.g. + /// Geocoding): + /// + /// gmaps.client.requests — counter of every call attempted (success or failure). + /// gmaps.client.request.duration — histogram of call latency in seconds. + /// gmaps.client.request.errors — counter of calls that threw (transport/HTTP failures). + /// + /// Google "business" errors (for example ZERO_RESULTS or OVER_QUERY_LIMIT) arrive as a + /// 200 OK and do not throw, so they are not counted as errors; they are visible through the + /// gmaps.response_status tag on the duration histogram. + /// + public static class GoogleMapsMetrics + { + /// + /// The name instruments are published from: + /// "GoogleMapsApi" (the same name as the tracing source). Pass this to AddMeter(...) + /// to collect the library's metrics. + /// + public const string MeterName = "GoogleMapsApi"; + + internal static readonly Meter Meter = + new Meter(MeterName, typeof(GoogleMapsMetrics).Assembly.GetName().Version?.ToString()); + + internal static readonly Counter Requests = Meter.CreateCounter( + "gmaps.client.requests", unit: "{request}", + description: "Number of Google Maps API calls attempted."); + + internal static readonly Histogram RequestDuration = Meter.CreateHistogram( + "gmaps.client.request.duration", unit: "s", + description: "Duration of Google Maps API calls."); + + internal static readonly Counter RequestErrors = Meter.CreateCounter( + "gmaps.client.request.errors", unit: "{error}", + description: "Number of Google Maps API calls that failed with an exception."); + + internal static void Record( + string apiName, + string method, + int? statusCode, + string? responseStatus, + string? errorType, + long startTimestamp) + { + if (!Requests.Enabled && !RequestDuration.Enabled && !RequestErrors.Enabled) + return; + + // Stopwatch.GetElapsedTime is unavailable on netstandard2.0, so compute seconds directly. + var elapsedSeconds = (Stopwatch.GetTimestamp() - startTimestamp) / (double)Stopwatch.Frequency; + + var requestTags = new TagList + { + { "gmaps.api", apiName }, + { "http.request.method", method }, + }; + + Requests.Add(1, requestTags); + + var durationTags = requestTags; + if (statusCode is not null) + durationTags.Add("http.response.status_code", statusCode.Value); + if (responseStatus is not null) + durationTags.Add("gmaps.response_status", responseStatus); + RequestDuration.Record(elapsedSeconds, durationTags); + + if (errorType is not null) + { + RequestErrors.Add(1, + new KeyValuePair("gmaps.api", apiName), + new KeyValuePair("error.type", errorType)); + } + } + } +} diff --git a/GoogleMapsApi/Engine/MapsAPIGenericEngine.cs b/GoogleMapsApi/Engine/MapsAPIGenericEngine.cs index 5a2d3a3..305822f 100644 --- a/GoogleMapsApi/Engine/MapsAPIGenericEngine.cs +++ b/GoogleMapsApi/Engine/MapsAPIGenericEngine.cs @@ -49,21 +49,32 @@ internal static async Task QueryGoogleAPIAsync( var body = request.GetRequestBody(); var apiName = GetApiName(); + var method = body is null ? "GET" : "POST"; using var activity = GoogleMapsActivity.Source.StartActivity($"GoogleMapsApi {apiName}", ActivityKind.Client); if (activity is not null) { activity.SetTag("gmaps.api", apiName); - activity.SetTag("http.request.method", body is null ? "GET" : "POST"); + activity.SetTag("http.request.method", method); activity.SetTag("server.address", uri.Host); activity.SetTag("url.full", RedactUrl(uri)); } + var startTimestamp = Stopwatch.GetTimestamp(); + int? statusCode = null; + string? responseStatus = null; + string? errorType = null; + Action onResponseStatus = code => + { + statusCode = code; + activity?.SetTag("http.response.status_code", code); + }; + try { // Binary endpoints (e.g. Solar GeoTIFF) carry raw bytes, not JSON. if (typeof(IBinaryResponse).IsAssignableFrom(typeof(TResponse))) { - var (bytes, contentType) = await SendAsync(httpClient, uri, body, timeout, token, activity, async response => + var (bytes, contentType) = await SendAsync(httpClient, uri, body, timeout, token, onResponseStatus, async response => (await response.Content.ReadAsByteArrayAsync().ConfigureAwait(false), response.Content.Headers.ContentType?.MediaType)).ConfigureAwait(false); @@ -76,17 +87,17 @@ internal static async Task QueryGoogleAPIAsync( return binaryResult; } - var responseContent = await SendAsync(httpClient, uri, body, timeout, token, activity, + var responseContent = await SendAsync(httpClient, uri, body, timeout, token, onResponseStatus, response => response.Content.ReadAsStringAsync()).ConfigureAwait(false); onRawResponseReceived?.Invoke(Encoding.UTF8.GetBytes(responseContent)); var response = JsonSerializer.Deserialize(responseContent, jsonOptions)!; - if (activity is not null) + if (activity is not null || GoogleMapsMetrics.RequestDuration.Enabled) { - var responseStatus = GetResponseStatus(response); - if (responseStatus is not null) + responseStatus = GetResponseStatus(response); + if (activity is not null && responseStatus is not null) activity.SetTag("gmaps.response_status", responseStatus); } @@ -94,13 +105,15 @@ internal static async Task QueryGoogleAPIAsync( } catch (Exception ex) { + errorType = ex.GetType().FullName; activity?.SetStatus(ActivityStatusCode.Error, ex.Message); - activity?.SetTag("error.type", ex.GetType().FullName); + activity?.SetTag("error.type", errorType); throw; } finally { body?.Dispose(); + GoogleMapsMetrics.Record(apiName, method, statusCode, responseStatus, errorType, startTimestamp); } } @@ -121,7 +134,7 @@ private static string RedactUrl(Uri uri) return property?.GetValue(response)?.ToString(); } - private static async Task SendAsync(HttpClient httpClient, Uri uri, HttpContent? body, TimeSpan timeout, CancellationToken cancellationToken, Activity? activity, Func> readContent) + private static async Task SendAsync(HttpClient httpClient, Uri uri, HttpContent? body, TimeSpan timeout, CancellationToken cancellationToken, Action? onResponseStatus, Func> readContent) { using var cts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken); if (timeout != TimeSpan.FromMilliseconds(-1)) @@ -132,7 +145,7 @@ private static async Task SendAsync(HttpClient httpClient, Uri uri, HttpCo using var response = body == null ? await httpClient.GetAsync(uri, cts.Token).ConfigureAwait(false) : await httpClient.PostAsync(uri, body, cts.Token).ConfigureAwait(false); - activity?.SetTag("http.response.status_code", (int)response.StatusCode); + onResponseStatus?.Invoke((int)response.StatusCode); await HandleHttpResponse(response, timeout).ConfigureAwait(false); return await readContent(response).ConfigureAwait(false); } diff --git a/GoogleMapsApi/PublicAPI.Unshipped.txt b/GoogleMapsApi/PublicAPI.Unshipped.txt index 4082e24..572952d 100644 --- a/GoogleMapsApi/PublicAPI.Unshipped.txt +++ b/GoogleMapsApi/PublicAPI.Unshipped.txt @@ -813,3 +813,5 @@ GoogleMapsApi.IGoogleMapsClient.PollenForecast.get -> GoogleMapsApi.IEngineFacad GoogleMapsApi.IGoogleMapsClient.PollenHeatmapTile.get -> GoogleMapsApi.IEngineFacade! override GoogleMapsApi.Entities.Pollen.Request.PollenForecastRequest.GetUri() -> System.Uri! override GoogleMapsApi.Entities.Pollen.Request.PollenHeatmapTileRequest.GetUri() -> System.Uri! +GoogleMapsApi.Diagnostics.GoogleMapsMetrics +const GoogleMapsApi.Diagnostics.GoogleMapsMetrics.MeterName = "GoogleMapsApi" -> string! diff --git a/README.md b/README.md index 4f130ce..a2216d0 100644 --- a/README.md +++ b/README.md @@ -45,7 +45,7 @@ Google ships official .NET packages primarily for its *newer* gRPC APIs — `Goo | Runtime dependencies | Lightweight: `System.Text.Json` on modern .NET; small compatibility helpers on `netstandard2.0` | gRPC stack: `Google.Api.Gax.Grpc`, `Google.Geo.Type`, Protobuf/gRPC dependencies; `Grpc.Core` on .NET Framework | | Maturity | Stable 2.x, 2M+ downloads | Several Maps packages still in beta (`1.0.0-betaNN`) | | DI / `IHttpClientFactory` | [`AddGoogleMaps(...)`](#instance-based-client-ihttpclientfactory-friendly) extension | `ClientBuilder` pattern; no `IHttpClientFactory` story | -| Observability | [OpenTelemetry](#observability-opentelemetry-tracing) span per call (API key redacted) | None built-in | +| Observability | [OpenTelemetry](#observability-opentelemetry-tracing--metrics) tracing span + metrics per call (API key redacted) | None built-in | **Prefer Google's official packages when** you need gRPC transport or streaming, deep integration with other Google Cloud client libraries, or Google's own support — and you only consume one of the gRPC-backed APIs. Otherwise, a single idiomatic package that also covers the web-service APIs is usually the friendlier choice. @@ -385,7 +385,9 @@ DirectionsResponse directions = maps.Directions.QueryAsync(directionsRequest).Ge Console.WriteLine(directions); ``` -## Observability (OpenTelemetry tracing) +## Observability (OpenTelemetry tracing & metrics) + +### Tracing Every API call emits a [distributed-tracing](https://opentelemetry.io/docs/concepts/signals/traces/) span from an [`ActivitySource`](https://learn.microsoft.com/dotnet/core/diagnostics/distributed-tracing) named **`GoogleMapsApi`**. @@ -417,6 +419,33 @@ representing call latency. Tags follow OpenTelemetry HTTP semantic conventions p Failures set the span status to `Error` and add an `error.type` tag. +### Metrics + +Every call also records to three instruments published from a [`Meter`](https://learn.microsoft.com/dotnet/core/diagnostics/metrics-instrumentation) +named **`GoogleMapsApi`** (same name as the trace source). Like the tracing, the instrumentation is inert until a +listener subscribes. With OpenTelemetry, add the meter by name (the constant `GoogleMapsApi.Diagnostics.GoogleMapsMetrics.MeterName`): + +``` C# +using OpenTelemetry.Metrics; +using GoogleMapsApi.Diagnostics; + +builder.Services.AddOpenTelemetry().WithMetrics(metrics => metrics + .AddMeter(GoogleMapsMetrics.MeterName) // "GoogleMapsApi" + .AddOtlpExporter()); +``` + +| Instrument | Type | Unit | Meaning | +| --- | --- | --- | --- | +| `gmaps.client.requests` | counter | `{request}` | Calls attempted (success **and** failure) | +| `gmaps.client.request.duration` | histogram | `s` | Call latency in seconds | +| `gmaps.client.request.errors` | counter | `{error}` | Calls that threw (transport/HTTP failures) | + +All three carry the `gmaps.api` tag; requests/duration also carry `http.request.method`, and the duration histogram +adds `http.response.status_code` and `gmaps.response_status` when available. The error counter adds `error.type`. + +> Google "business" statuses (`ZERO_RESULTS`, `OVER_QUERY_LIMIT`, …) arrive as `200 OK` and **do not** count as +> errors — they surface via the `gmaps.response_status` tag on the duration histogram. + --- *If this library saved you time, please ⭐ the repo — it helps others find it.*