Skip to content

Loki log handler for Grafana#117

Open
Kalu32k wants to merge 16 commits intomainfrom
issue/92
Open

Loki log handler for Grafana#117
Kalu32k wants to merge 16 commits intomainfrom
issue/92

Conversation

@Kalu32k
Copy link
Contributor

@Kalu32k Kalu32k commented Dec 17, 2025

Feature - Loki (Logs)

Co-authors:

@MagnusTerak
@nilsishome

Merge

This PR must be merged after #95

Description

This PR covers the implementation of Loki.

Motivation and Context

This feature adds the ability to check logs in our grafana view for easier troubleshooting.

How Has This Been Tested?

Manual testing, you can access Grafana with port 3000 and login with
username: admin
password: admin

see "Telemetry manual.md" for more information

Graph_White

Screenshots:

image

Related Issues:

Loki #92

Summary by CodeRabbit

Release Notes

  • Documentation

    • Added Telemetry Manual with guidance on accessing and using the application's observability features.
  • New Features

    • Introduced complete observability platform including metrics collection, log aggregation, and dashboard visualization.
    • Access live monitoring dashboards at localhost:3000 to view system metrics and logs.
    • Added automatic request trace ID tracking for enhanced visibility.

✏️ Tip: You can customize this high-level summary in your review settings.

MagnusTerak and others added 16 commits December 8, 2025 14:49
Co-authored-by: Kalu32k <158482431+Kalu32k@users.noreply.github.com>
Co-authored-by: nilsishome <181159296+nilsishome@users.noreply.github.com>
# Conflicts:
#	backend/pom.xml
#	backend/src/main/java/org/fungover/zipp/security/SecurityConfig.java
Co-authored-by: Kalu32k <158482431+Kalu32k@users.noreply.github.com>
Co-authored-by: nilsishome <181159296+nilsishome@users.noreply.github.com>
Co-authored-by: Kalu32k <158482431+Kalu32k@users.noreply.github.com>
Co-authored-by: nilsishome <181159296+nilsishome@users.noreply.github.com>
Co-authored-by: MagnusTerak <49763392+magnusterak@users.noreply.github.com>
Co-authored-by: Kalu32k <158482431+kalu32k@users.noreply.github.com>
Co-authored-by: MagnusTerak <49763392+magnusterak@users.noreply.github.com>
Co-authored-by: Kalu32k <158482431+kalu32k@users.noreply.github.com>
…i volume paths, and enable anonymous Grafana access.

Co-authored-by:
Magnus <49763392+magnusterak@users.noreply.github.com>
Co-authored-by: <181159296+nilsishome@users.noreply.github.com>
…i volume paths, and enable anonymous Grafana access.

Co-authored-by: Magnus <49763392+magnusterak@users.noreply.github.com>
Co-authored-by: nilsishome <181159296+nilsishome@users.noreply.github.com>
Co-authored-by: Magnus <49763392+magnusterak@users.noreply.github.com>
Co-authored-by: nilsishome <181159296+nilsishome@users.noreply.github.com>
@coderabbitai
Copy link

coderabbitai bot commented Dec 17, 2025

Walkthrough

Introduces comprehensive observability infrastructure integrating OpenTelemetry, Prometheus, Loki, and Grafana. Adds configuration files, Spring Boot dependencies, and telemetry collection beans to enable metrics, logs, and trace visualization for the application.

Changes

Cohort / File(s) Summary
Documentation
Telemetry manual.md
New telemetry user guide documenting access, login, usage, and logout procedures
Observability Infrastructure Configuration
backend/config/grafana/provisioning/dashboards/dashboards.yaml, backend/config/grafana/provisioning/datasources/datasources.yaml, backend/config/loki/loki-config.yaml, backend/config/otel/otel-collector-config.yaml, backend/config/prometheus/prometheus-config.yaml
New provisioning and configuration files for Grafana datasources/dashboards, Loki log aggregation, OpenTelemetry Collector signal processing, and Prometheus metrics scraping
Backend Observability Integration
backend/src/main/java/org/fungover/zipp/opentelemetry/ContextPropagationConfiguration.java, backend/src/main/java/org/fungover/zipp/opentelemetry/InstallOpenTelemetryAppender.java, backend/src/main/java/org/fungover/zipp/opentelemetry/TraceIdFilter.java
New Spring configuration and filter classes for context propagation, OpenTelemetry appender initialization, and trace ID injection in responses
Backend Configuration & Logging
backend/src/main/resources/application-dev.yml, backend/src/main/resources/application.properties, backend/src/main/resources/logback-spring.xml
Spring Boot management endpoint exposure, OTLP logging endpoint configuration, and logback appender setup for OpenTelemetry integration
Backend Dependencies
backend/pom.xml
Added OpenTelemetry Spring Boot starter, actuator, Prometheus micrometer registry, logback appender, and protobuf dependencies
Docker Compose
docker-compose.yml
New observability stack services including otel-collector, prometheus, loki, grafana, and node-exporter with volumes, networking, and resource constraints

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

  • Areas requiring attention:
    • backend/src/main/java/org/fungover/zipp/opentelemetry/TraceIdFilter.java — Verify filter chain integration and trace ID extraction logic
    • backend/config/otel/otel-collector-config.yaml — Validate OTLP receiver configuration, resource detection processors, and exporter targets
    • backend/config/loki/loki-config.yaml — Complex configuration with schema, storage, and attribute processing rules; confirm OTLP attribute mappings match expectations
    • docker-compose.yml — Large addition with multiple new services; verify port mappings, volume mounts, environment variable substitution, and inter-service networking
    • backend/pom.xml — Confirm dependency versions and scopes are compatible with existing Spring Boot version
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch issue/92

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@Kalu32k Kalu32k linked an issue Dec 17, 2025 that may be closed by this pull request
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (7)
Telemetry manual.md (1)

1-15: Enhance documentation with credentials and usage details.

The manual is missing the default credentials mentioned in the PR description (username: admin, password: admin). Additionally, consider adding more specific guidance on:

  • Which dashboards are available
  • How to navigate to logs/metrics/traces
  • What data sources are configured

Apply this diff to add the missing credentials:

 # Telemetry Manual
 Basic instructions on how to use the telemetry app.
 
 You can access the app by typing in localhost:3000 into your browser URL.
 If you want to have admin access you need to login with credentials with
 instructions below.
 
 ## How to login
-Type in localhost:3000/login into your browser URL and login with your credentials.
+Type in localhost:3000/login into your browser URL and login with the following credentials:
+- **Username:** admin
+- **Password:** admin
backend/src/main/resources/application.properties (1)

21-21: Clarify property placeholder naming for protocol.

The placeholder ${https:http} uses "https" as the property name with "http" as the default. While this works, the naming is confusing—it suggests the protocol is "https" when actually defaulting to "http". Consider a clearer name like ${OTLP_PROTOCOL:http} for better readability.

-management.opentelemetry.logging.export.otlp.endpoint=${https:http}://${zipp.city:localhost}:4318/v1/logs
+management.opentelemetry.logging.export.otlp.endpoint=${OTLP_PROTOCOL:http}://${OTEL_HOST:localhost}:4318/v1/logs
backend/config/grafana/provisioning/datasources/datasources.yaml (1)

1-14: Add apiVersion for proper Grafana provisioning.

Grafana datasource provisioning files typically require apiVersion: 1 at the root level. Also, the commented auth_enabled: true on line 13 is a Loki server configuration option, not a Grafana datasource property—consider removing it to avoid confusion.

+apiVersion: 1
+
 datasources:
   - name: Prometheus
     type: prometheus
     access: proxy
     url: http://prometheus:9090
     isDefault: true
 
   - name: Loki
     type: loki
     access: proxy
     url: http://loki:3100
     isDefault: false
-#    auth_enabled: true
backend/config/otel/otel-collector-config.yaml (2)

8-15: Consider restricting CORS origins for production.

The wildcard "*" for allowed_origins is acceptable for local development but should be restricted to specific origins in production to prevent unauthorized cross-origin requests.


39-40: Debug exporter verbosity may be noisy in production.

verbosity: detailed can generate significant log output. Consider reducing to normal or basic for production, or making it configurable via environment variable.

docker-compose.yml (2)

144-145: Pin node-exporter image version for reproducibility.

Using latest tag can lead to unexpected behavior when the image updates. Pin to a specific version for reproducible builds.

-    image: prom/node-exporter:latest
+    image: prom/node-exporter:v1.9.0

136-139: Anonymous Editor access is permissive for Grafana.

GF_AUTH_ANONYMOUS_ORG_ROLE: "Editor" allows unauthenticated users to modify dashboards. For a more secure default, consider using Viewer for read-only access, especially if this configuration might be used beyond local development.

       GF_AUTH_ANONYMOUS_ENABLED: "true"
-      GF_AUTH_ANONYMOUS_ORG_ROLE: "Editor"   # or Viewer for read-only
+      GF_AUTH_ANONYMOUS_ORG_ROLE: "Viewer"
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 75e16a3 and da56c36.

📒 Files selected for processing (14)
  • Telemetry manual.md (1 hunks)
  • backend/config/grafana/provisioning/dashboards/dashboards.yaml (1 hunks)
  • backend/config/grafana/provisioning/datasources/datasources.yaml (1 hunks)
  • backend/config/loki/loki-config.yaml (1 hunks)
  • backend/config/otel/otel-collector-config.yaml (1 hunks)
  • backend/config/prometheus/prometheus-config.yaml (1 hunks)
  • backend/pom.xml (1 hunks)
  • backend/src/main/java/org/fungover/zipp/opentelemetry/ContextPropagationConfiguration.java (1 hunks)
  • backend/src/main/java/org/fungover/zipp/opentelemetry/InstallOpenTelemetryAppender.java (1 hunks)
  • backend/src/main/java/org/fungover/zipp/opentelemetry/TraceIdFilter.java (1 hunks)
  • backend/src/main/resources/application-dev.yml (1 hunks)
  • backend/src/main/resources/application.properties (1 hunks)
  • backend/src/main/resources/logback-spring.xml (1 hunks)
  • docker-compose.yml (1 hunks)
🧰 Additional context used
🪛 LanguageTool
Telemetry manual.md

[grammar] ~14-~14: Ensure spelling is correct
Context: ...down to see metrics and logs ## How to logout Type in localhost:3000/logout into your ...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)

🔇 Additional comments (13)
backend/src/main/java/org/fungover/zipp/opentelemetry/InstallOpenTelemetryAppender.java (1)

8-21: LGTM!

The implementation correctly wires OpenTelemetry into the logback appender during Spring initialization. The use of InitializingBean, constructor injection, and package-private visibility are all appropriate design choices.

backend/config/prometheus/prometheus-config.yaml (1)

1-35: LGTM!

The Prometheus configuration correctly defines scrape jobs for the observability stack components (Prometheus self-monitoring, node-exporter, and otel-collector) with appropriate targets and a reasonable scrape interval.

backend/src/main/java/org/fungover/zipp/opentelemetry/ContextPropagationConfiguration.java (1)

7-13: LGTM!

The configuration correctly exposes a ContextPropagatingTaskDecorator bean to enable context propagation across async tasks. The use of proxyBeanMethods = false is a good optimization for configuration classes that don't need inter-bean method calls.

backend/src/main/java/org/fungover/zipp/opentelemetry/TraceIdFilter.java (1)

15-37: LGTM!

The filter correctly retrieves the current trace ID from Micrometer's tracer and adds it to the response headers for correlation. The implementation is null-safe, uses appropriate base class (OncePerRequestFilter), and maintains clean separation of concerns with the private helper method.

backend/src/main/resources/logback-spring.xml (1)

1-12: LGTM!

The logback configuration correctly integrates the OpenTelemetry appender alongside the console appender, enabling logs to be exported to the observability stack while maintaining local console output.

backend/pom.xml (2)

193-197: No action needed. Version 4.33.2 is the latest stable release of protobuf-java according to Maven Central Repository.


187-191: Document the alpha dependency requirement and establish a migration plan.

The dependency opentelemetry-logback-appender-1.0 uses version 2.21.0-alpha, a pre-release version. While this appears intentional (marked in the pom.xml), there is no documentation explaining why the alpha version is required instead of a stable release, nor is there a migration plan to track when a stable version can be adopted. Add a comment in pom.xml explaining the requirement and create a tracking issue for migrating to a stable release once available.

backend/config/grafana/provisioning/dashboards/dashboards.yaml (1)

1-10: LGTM!

The dashboard provisioning configuration is well-structured and correctly points to /var/lib/grafana/dashboards, which aligns with the volume mount in docker-compose.yml.

backend/config/otel/otel-collector-config.yaml (1)

29-56: LGTM on pipeline configuration.

The metrics and logs pipelines are well-structured with appropriate receivers, processors, and exporters. The comment clarifying the OTLP HTTP endpoint behavior (appending /v1/logs) is helpful.

backend/config/loki/loki-config.yaml (2)

46-51: Dropping all log attributes may lose valuable context.

The configuration extracts user.id to structured metadata but then drops all other log attributes with regex: .*. This could discard useful debugging information like exception details, request IDs, or custom log fields.

Consider being more selective or using structured_metadata for additional attributes you want to query.


1-12: Configuration suitable for single-instance development.

Using auth_enabled: false and inmemory ring store is appropriate for local development. For production, you would need to enable authentication and use a persistent ring store (e.g., consul, etcd).

docker-compose.yml (2)

77-84: Redundant port environment variable definitions.

The port variables are defined in the environment section but also referenced in ports. While this works, the environment variables are only needed if the otel-collector config references them (which it does for OTEL_COLLECTOR_PORT_PROM). Consider documenting that these must also be set externally (e.g., in .env) for the host port mappings to resolve correctly.


62-84: Running otel-collector as root.

user: 0:0 runs the collector as root, which is needed for host metrics access. This is acceptable for development but consider documenting this security trade-off.

Comment on lines +25 to +29
management:
endpoints:
web:
exposure:
include: '*'
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Limit exposed management endpoints to only those needed.

Exposing all management endpoints (*) includes sensitive endpoints such as env, configprops, heapdump, threaddump, and others that can leak configuration secrets, environment variables, or memory dumps. Even in a dev profile, this creates unnecessary risk if the dev environment is network-accessible.

Apply this diff to expose only the endpoints required for observability:

 management:
   endpoints:
     web:
       exposure:
-        include: '*'
+        include: 'health,info,metrics,prometheus'
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
management:
endpoints:
web:
exposure:
include: '*'
management:
endpoints:
web:
exposure:
include: 'health,info,metrics,prometheus'
🤖 Prompt for AI Agents
In backend/src/main/resources/application-dev.yml around lines 25 to 29, the
management endpoints are currently exposed with '*' which leaks sensitive
endpoints; replace the wildcard with an explicit list of only the observability
endpoints you need (for example: health, info, metrics, prometheus, httptrace or
whichever your monitoring stack requires) and ensure sensitive endpoints like
env, configprops, heapdump, threaddump, etc. are not included; update the YAML
to include only those specific keys and verify the dev profile still exposes
necessary monitoring data but not full actuator surface.

Comment on lines +109 to +122
loki:
image: grafana/loki:latest
container_name: loki
ports:
- "3100:3100"
networks:
- app-net
volumes:
- loki-data:/loki
- ./backend/config/loki/loki-config.yaml:/etc/loki/loki-config.yaml
command:
- --config.expand-env=true
- --config.file=/etc/loki/loki-config.yaml

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Pin Loki image version and add missing configuration.

The Loki service uses latest tag instead of a pinned version. The latest stable version is v3.6.3, though v3.5.x remains stable. Additionally, the service is missing restart: unless-stopped policy and memory limits that other services have, and has inconsistent YAML indentation.

   loki:
-      image: grafana/loki:latest
-      container_name: loki
-      ports:
-        - "3100:3100"
-      networks:
-        - app-net
-      volumes:
-        - loki-data:/loki
-        - ./backend/config/loki/loki-config.yaml:/etc/loki/loki-config.yaml
-      command:
-        - --config.expand-env=true
-        - --config.file=/etc/loki/loki-config.yaml
+    image: grafana/loki:3.6.3
+    container_name: loki
+    restart: unless-stopped
+    deploy:
+      resources:
+        limits:
+          memory: 256M
+    ports:
+      - "3100:3100"
+    networks:
+      - app-net
+    volumes:
+      - loki-data:/loki
+      - ./backend/config/loki/loki-config.yaml:/etc/loki/loki-config.yaml
+    command:
+      - --config.expand-env=true
+      - --config.file=/etc/loki/loki-config.yaml
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
loki:
image: grafana/loki:latest
container_name: loki
ports:
- "3100:3100"
networks:
- app-net
volumes:
- loki-data:/loki
- ./backend/config/loki/loki-config.yaml:/etc/loki/loki-config.yaml
command:
- --config.expand-env=true
- --config.file=/etc/loki/loki-config.yaml
loki:
image: grafana/loki:3.6.3
container_name: loki
restart: unless-stopped
deploy:
resources:
limits:
memory: 256M
ports:
- "3100:3100"
networks:
- app-net
volumes:
- loki-data:/loki
- ./backend/config/loki/loki-config.yaml:/etc/loki/loki-config.yaml
command:
- --config.expand-env=true
- --config.file=/etc/loki/loki-config.yaml
🤖 Prompt for AI Agents
In docker-compose.yml around lines 109 to 122, the Loki service uses the
unpinned "latest" image, lacks a restart policy and memory limits, and shows
inconsistent YAML indentation; update the image to a pinned stable tag (e.g.,
grafana/loki:v3.6.3 or v3.5.x), add restart: unless-stopped, and add resource
limits consistent with other services (e.g., deploy.resources.limits.memory:
512M or the same value used elsewhere), and fix indentation to match the compose
file's service block style so keys align with other services.

Copy link

@0x0xoscar 0x0xoscar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well written and easy to follow. The configuration is clear and easy to understand, and the feature makes troubleshooting easier by exposing logs directly in Grafana.
Everything looks good to me, well done!

Copy link
Contributor

@AlexCode-dot AlexCode-dot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! The setup is easy to follow, and the documentation clearly explains how to access and use Grafana and Loki. Having logs directly available in Grafana will definitely make troubleshooting smoother.

Nice work overall 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Loki (Logs)

5 participants