[FEAT] Dashboard metrics system #1724
base: main
Conversation
Summary by CodeRabbit
Walkthrough
Adds a Dashboard Metrics feature: new Django app with models, migrations, Celery tasks and worker, Redis-backed bucket cache, a core metrics API and queue backend, services and management commands, extensive tests, frontend dashboard UI/components/hooks/styles, and runtime tooling (docker, poe, dependencies).
Sequence Diagram(s)
sequenceDiagram
participant Client as Frontend Client
participant API as DashboardMetricsViewSet
participant Cache as Redis Cache
participant Service as MetricsQueryService
participant DB as Metrics DB
Client->>API: GET /metrics/summary?org=X&start=Y&end=Z
API->>API: _select_source(params)
alt hourly source & bucket-cache enabled
API->>Cache: mget_metrics_buckets(org, range)
Cache-->>API: cached buckets + missing list
API->>Service: query missing buckets
Service->>DB: fetch missing rows
DB-->>Service: rows
Service-->>API: rows
API->>Cache: mset_metrics_buckets(new buckets)
else daily/monthly or full cache miss
API->>Service: query aggregated data
Service->>DB: fetch aggregates
DB-->>Service: aggregates
Service-->>API: aggregates
end
API->>Client: HTTP 200 MetricsResponse
sequenceDiagram
participant Producer as Integration Point
participant Capture as MetricsCapture
participant Core as core.metrics.record
participant Backend as QueueBackend
participant Broker as RabbitMQ
participant Worker as worker-metrics
participant Task as process_dashboard_metric_events
participant DB as Metrics DB
Producer->>Capture: record_*(...)
Capture->>Core: record(org_id, metric_name, value,...)
Core->>Backend: backend.record(MetricEvent)
Backend->>Broker: publish Celery-style message to queue
Broker->>Worker: deliver to dashboard_metric_events
Worker->>Task: process batch of events
Task->>DB: bulk upsert EventMetricsHourly/Daily/Monthly
DB-->>Task: upsert results
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
🚥 Pre-merge checks: ✅ Passed checks (3 passed)
Actionable comments posted: 8
Fix all issues with AI Agents 🤖
In @backend/dashboard_metrics/capture.py:
- Around line 51-338: MetricsCapture methods call record() with the metric enum
as the first positional arg and keyword names like org_id= and value=, but core
record signature expects org_id first, metric_name second and metric_value
third, so change every call in record_api_request, record_etl_execution,
record_document_processed (both DOCUMENTS_PROCESSED and PAGES_PROCESSED),
record_llm_call (LLM_CALLS and LLM_USAGE), record_challenge,
record_summarization, and record_prompt_execution to match the core signature:
call record(org_id, MetricName.<X>, <metric_value>, labels=metric_labels,
project=project, tag=None) (use 1 for single-count events, pages for pages, cost
for LLM_USAGE as float), and remove the invalid value= keyword and duplicate
org_id keyword usage.
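A minimal sketch of one corrected call site under the signature described above (`record(org_id, metric_name, metric_value, ...)`); the label contents are illustrative only, not taken from the PR:

```python
from unstract.core.metrics import record
from unstract.core.metrics.types import MetricName

def record_document_processed(org_id: str, pages: int, project: str | None = None) -> bool:
    """Hypothetical capture helper using the argument order the prompt describes."""
    metric_labels = {"source": "workflow"}  # illustrative labels only
    # One count event for the document itself
    ok = record(org_id, MetricName.DOCUMENTS_PROCESSED, 1, labels=metric_labels, project=project, tag=None)
    # Page count recorded as the metric value
    ok = record(org_id, MetricName.PAGES_PROCESSED, pages, labels=metric_labels, project=project, tag=None) and ok
    return ok
```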
In @backend/dashboard_metrics/urls.py:
- Around line 9-15: The URL configuration is missing a mapping for the
DashboardMetricsViewSet.health action; add a new view mapping similar to the
others (e.g., create metrics_health = DashboardMetricsViewSet.as_view({"get":
"health"})) and include it in the URL patterns so GET /api/v2/metrics/health/ is
exposed; update any import/urlpatterns list references where other metrics_*
views are registered to include metrics_health.
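A sketch of the missing mapping, following the pattern the prompt describes for the other `metrics_*` views (the exact path and name strings are assumptions):

```python
from django.urls import path

from dashboard_metrics.views import DashboardMetricsViewSet

metrics_health = DashboardMetricsViewSet.as_view({"get": "health"})

urlpatterns = [
    # ... existing metrics_* mappings ...
    path("health/", metrics_health, name="dashboard-metrics-health"),
]
```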
In @backend/dashboard_metrics/views.py:
- Around line 267-273: Both live_summary and live_series call
UserContext.get_organization() and access .id directly; add the same guard used
in get_queryset(): if organization is None (or falsy) log an explanatory message
and raise PermissionDenied before accessing organization.id, then pass org_id to
MetricsQueryService.get_all_metrics_summary / get_all_metrics_series; apply the
same check for the other occurrence around lines referenced (the block using
UserContext.get_organization() / organization.id at ~310-312).
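A sketch of the guard, mirroring the `get_queryset()` pattern quoted later in this review; the log wording and the service call's keyword are assumptions, and `UserContext`, `MetricsQueryService`, and `logger` are assumed to be the module-level imports already present in `views.py`:

```python
from rest_framework.exceptions import PermissionDenied
from rest_framework.response import Response

# inside DashboardMetricsViewSet
def live_summary(self, request):
    organization = UserContext.get_organization()
    if not organization:
        logger.warning("No organization context for live metrics request")
        raise PermissionDenied("No organization context")
    summary = MetricsQueryService.get_all_metrics_summary(org_id=organization.id)
    return Response(summary)
```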
In @unstract/core/src/unstract/core/metrics/types.py:
- Around line 35-48: Add a unit test that ensures every value of the MetricName
enum has a corresponding key in METRIC_TYPE_MAP to prevent KeyError in
get_metric_type(); specifically, iterate over MetricName (or
MetricName.__members__.values()/MetricName) and assert that each .value is
present in METRIC_TYPE_MAP, referencing METRIC_TYPE_MAP and MetricName; this
complements validate_metric() and prevents missing entries that would cause
get_metric_type() to fail.
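A minimal test along those lines, assuming the module path from this PR's layout:

```python
from unstract.core.metrics.types import METRIC_TYPE_MAP, MetricName

def test_metric_type_map_covers_all_metric_names():
    """Every MetricName value must have an entry in METRIC_TYPE_MAP."""
    missing = [m.value for m in MetricName if m.value not in METRIC_TYPE_MAP]
    assert not missing, f"MetricName values missing from METRIC_TYPE_MAP: {missing}"
```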
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backend/pyproject.toml (1)
15-21: celery-batches version too old; use >=0.9.0 for Celery 5.3.4 and Python 3.12 support
The constraint `celery-batches>=0.8.0` pins to an outdated release (Jun 2023). For Celery 5.3.4 and Python 3.12, use `celery-batches>=0.9.0`, which is documented to support Celery ~5.0 and Python 3.9-3.13. Version 0.8.0 predates Python 3.12 and lacks testing against current Celery 5.x releases.
🧹 Nitpick comments (16)
backend/dashboard_metrics/tasks.py (2)
37-75: Time bucketing helpers look correct; naive datetimes are assumed UTC
The truncation helpers correctly normalize numeric timestamps to UTC and bucket to hour/day/month. Note that naive datetimes are implicitly treated as UTC; ensure all producers send either UTC timestamps or explicitly timezone-aware datetimes so you don't silently mis-bucket mixed timezones.
77-206: Batched aggregation logic is solid; consider tightening event validation
The batching and in-memory aggregation into hourly/daily/monthly buckets looks good and aligns with celery-batches' request shape. Two minor points to consider:
- `event = request.args[0] if request.args else request.kwargs.get("event", {})` assumes the first arg is always a dict; if other producers appear later, a simple `isinstance(..., dict)` guard could avoid noisy warnings.
- `timestamp` is expected as float/datetime; if there's any chance of ISO strings coming through, you may want to normalize them explicitly rather than relying on a generic `except Exception` to skip them.
Nothing blocking here; more of a future-hardening note.
backend/backend/urls_v2.py (1)
63-67: Metrics URL include is consistent with existing routing
Mounting `dashboard_metrics.urls` at `"metrics/"` follows the same pattern as other feature areas (e.g., `"usage/"`, `"notifications/"`). Just ensure any external docs or clients expecting `/api/v2/metrics/...` are updated to match this base path if needed.
docker/docker-compose.yaml (1)
120-138: worker-metrics service wiring looks consistent with existing workers
The new `worker-metrics` service mirrors the existing Celery worker patterns (image, entrypoint, queue, autoscale, env) and targets the `dashboard_metric_events` queue as intended. If you expect metrics tasks to need shared workflow data or tool configs, consider whether it should also mount the same volumes as other workers; otherwise this setup looks good.
backend/pyproject.toml (1)
145-148: worker-metrics Poe task aligns with docker-compose configuration
The `worker-metrics` Poe task mirrors the docker-compose command (same app, log level, queue, autoscale), which is helpful for running the metrics worker locally outside Docker. Keep this in sync with any future queue or autoscale changes to avoid drift between CLI and compose.
backend/dashboard_metrics/tests/test_tasks.py (1)
149-164: Consider renaming unused loop variable.
The loop variable `i` is not used within the loop body. Following Python convention, consider renaming it to `_` to indicate it's intentionally unused.
🔎 Proposed refactor
- for i in range(5):
+ for _ in range(5):
      mock_request = MagicMock()
unstract/core/src/unstract/core/metrics/__init__.py (1)
113-115: Use `logging.exception` to preserve the traceback.
When catching exceptions during event creation, using `logger.exception()` instead of `logger.error()` will include the full traceback, which aids debugging.
🔎 Suggested fix
  except Exception as e:
-     logger.error(f"Failed to create metric event: {e}")
+     logger.exception(f"Failed to create metric event: {e}")
      return False
unstract/core/src/unstract/core/metrics/registry.py (1)
42-48: Non-deterministic ordering in `get_all_metrics()`.
The function returns `list(_REGISTERED_METRICS)` where `_REGISTERED_METRICS` is a set. While this works, the order of returned metrics may vary between calls. If consistent ordering is needed for API responses or testing, consider sorting the result.
🔎 Optional fix for deterministic ordering
  def get_all_metrics() -> list[str]:
      """Get all registered metric names.

      Returns:
          List of all registered metric names
      """
-     return list(_REGISTERED_METRICS)
+     return sorted(_REGISTERED_METRICS)
backend/dashboard_metrics/migrations/0004_update_cleanup_periodic_tasks.py (1)
38-57: Clarify naming: task cleans "daily" aggregates on a weekly schedule.
The comment says "weekly cleanup task for daily metrics" but the task is named `dashboard_metrics_cleanup_daily`. This is technically correct (it cleans data from the daily aggregation table on a weekly schedule), but the naming could cause confusion. Consider renaming to `dashboard_metrics_cleanup_daily_weekly` or updating the description to clarify it runs weekly to clean daily-aggregated data.
unstract/core/src/unstract/core/metrics/types.py (1)
85-96: Consider defensive error handling in `from_dict`.
If required keys (`org_id`, `metric_name`, `metric_value`, `metric_type`) are missing from `data`, a `KeyError` will be raised. While the caller should ensure valid data, adding a brief check or using `.get()` with explicit validation could provide clearer error messages during deserialization failures.
unstract/core/src/unstract/core/metrics/backends/queue.py (1)
151-156: Use `logger.exception` for error logging with traceback.
Replace `logger.error` with `logger.exception` in the exception handlers to include stack traces, which aids debugging production issues.
🔎 Suggested fix
  except KombuError as e:
-     logger.error(f"Failed to queue metric {event.metric_name}: {e}")
+     logger.exception(f"Failed to queue metric {event.metric_name}: {e}")
      return False
  except Exception as e:
-     logger.error(f"Unexpected error queuing metric: {e}")
+     logger.exception(f"Unexpected error queuing metric: {e}")
      return False
backend/dashboard_metrics/views.py (1)
335-369: Improve error handling and observability in `live_series` metric loop.
Catching a blanket `Exception` for each metric is reasonable to keep the endpoint partially available, but the current logging (`logger.error`) loses the traceback and makes debugging failures harder. Switching to `logger.exception` here will preserve the stack trace while keeping behavior the same.
Proposed logging tweak
- except Exception as e:
-     logger.error(f"Failed to fetch metric {metric_name}: {e}")
+ except Exception:
+     logger.exception("Failed to fetch metric %s", metric_name)
      errors.append(metric_name)
backend/dashboard_metrics/cache.py (1)
38-69: Time-aware TTL helper is well-designed but currently unused.
`get_time_aware_cache_ttl()` correctly distinguishes queries that include the current hour from purely historical ones, which would be useful to avoid over-caching near-real-time data. Right now it isn't wired into `cache_metrics_response`, so all endpoints rely on static TTLs.
Consider integrating `get_time_aware_cache_ttl()` into the decorator (e.g., deriving TTL from a parsed `end_date` param) for summary/series endpoints that query recent data. A sketch of one possible wiring follows.
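One possible wiring, assuming `get_time_aware_cache_ttl()` accepts the parsed end date; `_build_cache_key` is a hypothetical helper standing in for the decorator's existing key logic:

```python
import functools

from django.core.cache import cache
from rest_framework.response import Response

def cache_metrics_response(endpoint_name: str):
    def decorator(view_func):
        @functools.wraps(view_func)
        def wrapper(self, request, *args, **kwargs):
            key = _build_cache_key(endpoint_name, request)  # hypothetical helper
            cached = cache.get(key)
            if cached is not None:
                return Response(cached)
            response = view_func(self, request, *args, **kwargs)
            if response.status_code == 200:
                # Derive TTL from the requested range instead of a static value
                ttl = get_time_aware_cache_ttl(request.query_params.get("end_date"))
                cache.set(key, response.data, ttl)
            return response
        return wrapper
    return decorator
```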
backend/dashboard_metrics/models.py (3)
14-19: Avoid duplicating `MetricType` that already exists in core metrics
There's already a `MetricType` enum in `unstract/core/src/unstract/core/metrics/types.py` with the same values (`"counter"`, `"histogram"`). Having two different `MetricType` definitions (one `Enum`, one `TextChoices`) increases the chance of divergence or confusion.
Consider either:
- Reusing the core `MetricType` as the single source of truth and mapping its values into a `TextChoices`, or
- Renaming this one (e.g., `DashboardMetricType`) to make the separation explicit.
A sketch of the first option follows.
DefaultOrganizationManagerMixinalready subclassesmodels.Manager(perbackend/utils/models/organization_mixin.py), so inheriting from both is redundant:class EventMetricsHourlyManager(DefaultOrganizationManagerMixin): ...Same applies to
EventMetricsDailyManagerandEventMetricsMonthlyManager.
113-130: Address Ruff RUF012 on mutable Meta attributes (indexes,constraints)Ruff is flagging
indexesandconstraintsas mutable class attributes (RUF012). In DjangoMetaclasses this is common, but if you want to satisfy the linter and improve type hints you can annotate them asClassVar:from typing import ClassVar from django.db import models class Meta: indexes: ClassVar[list[models.Index]] = [ ... ] constraints: ClassVar[list[models.Constraint]] = [ ... ]You can apply the same pattern to the
Metaclasses of all three models, or alternatively disableRUF012for DjangoMetaif you prefer the standard Django style.Also applies to: 219-236, 325-342
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to Reviews > Disable Cache setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
⛔ Files ignored due to path filters (1)
`backend/uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (30)
- backend/backend/celery_config.py
- backend/backend/settings/base.py
- backend/backend/urls_v2.py
- backend/dashboard_metrics/__init__.py
- backend/dashboard_metrics/apps.py
- backend/dashboard_metrics/cache.py
- backend/dashboard_metrics/capture.py
- backend/dashboard_metrics/migrations/0001_initial.py
- backend/dashboard_metrics/migrations/0002_setup_cleanup_periodic_task.py
- backend/dashboard_metrics/migrations/0003_add_daily_monthly_tables.py
- backend/dashboard_metrics/migrations/0004_update_cleanup_periodic_tasks.py
- backend/dashboard_metrics/migrations/__init__.py
- backend/dashboard_metrics/models.py
- backend/dashboard_metrics/serializers.py
- backend/dashboard_metrics/services.py
- backend/dashboard_metrics/tasks.py
- backend/dashboard_metrics/tests/__init__.py
- backend/dashboard_metrics/tests/test_tasks.py
- backend/dashboard_metrics/urls.py
- backend/dashboard_metrics/views.py
- backend/pyproject.toml
- docker/docker-compose.yaml
- unstract/core/src/unstract/core/metrics/__init__.py
- unstract/core/src/unstract/core/metrics/backends/__init__.py
- unstract/core/src/unstract/core/metrics/backends/base.py
- unstract/core/src/unstract/core/metrics/backends/noop.py
- unstract/core/src/unstract/core/metrics/backends/queue.py
- unstract/core/src/unstract/core/metrics/config.py
- unstract/core/src/unstract/core/metrics/registry.py
- unstract/core/src/unstract/core/metrics/types.py
🧰 Additional context used
🧬 Code graph analysis (16)
unstract/core/src/unstract/core/metrics/backends/noop.py (3)
  unstract/core/src/unstract/core/metrics/types.py (1)
    - MetricEvent (52-96)
  unstract/core/src/unstract/core/metrics/backends/base.py (3)
    - AbstractMetricBackend (8-40)
    - record (15-24)
    - flush (27-32)
  unstract/core/src/unstract/core/metrics/__init__.py (1)
    - record (58-119)
unstract/core/src/unstract/core/metrics/backends/base.py (4)
  unstract/core/src/unstract/core/metrics/types.py (1)
    - MetricEvent (52-96)
  unstract/core/src/unstract/core/metrics/__init__.py (1)
    - record (58-119)
  unstract/core/src/unstract/core/metrics/backends/noop.py (2)
    - record (25-40)
    - flush (42-44)
  unstract/core/src/unstract/core/metrics/backends/queue.py (3)
    - record (117-156)
    - flush (200-205)
    - close (207-215)
unstract/core/src/unstract/core/metrics/registry.py (2)
  unstract/core/src/unstract/core/metrics/types.py (2)
    - MetricName (15-32)
    - MetricType (8-12)
  backend/dashboard_metrics/models.py (1)
    - MetricType (14-18)
backend/dashboard_metrics/tasks.py (1)
  backend/dashboard_metrics/models.py (4)
    - EventMetricsDaily (145-248)
    - EventMetricsHourly (39-142)
    - EventMetricsMonthly (251-354)
    - MetricType (14-18)
unstract/core/src/unstract/core/metrics/types.py (1)
  backend/dashboard_metrics/models.py (1)
    - MetricType (14-18)
backend/dashboard_metrics/urls.py (1)
  backend/dashboard_metrics/views.py (1)
    - DashboardMetricsViewSet (40-451)
backend/dashboard_metrics/views.py (4)
  backend/permissions/permission.py (1)
    - IsOrganizationMember (17-22)
  backend/utils/user_context.py (1)
    - UserContext (8-32)
  backend/dashboard_metrics/cache.py (1)
    - cache_metrics_response (88-145)
  backend/dashboard_metrics/services.py (2)
    - MetricsQueryService (24-447)
    - get_documents_processed (54-86)
backend/dashboard_metrics/models.py (3)
  backend/utils/models/organization_mixin.py (2)
    - DefaultOrganizationManagerMixin (26-29)
    - DefaultOrganizationMixin (7-23)
  unstract/core/src/unstract/core/metrics/types.py (1)
    - MetricType (8-12)
  backend/dashboard_metrics/serializers.py (1)
    - Meta (110-124)
backend/dashboard_metrics/capture.py (5)
  unstract/core/src/unstract/core/metrics/types.py (2)
    - MetricName (15-32)
    - MetricType (8-12)
  unstract/core/src/unstract/core/metrics/__init__.py (1)
    - record (58-119)
  unstract/core/src/unstract/core/metrics/backends/base.py (1)
    - record (15-24)
  unstract/core/src/unstract/core/metrics/backends/noop.py (1)
    - record (25-40)
  unstract/core/src/unstract/core/metrics/backends/queue.py (1)
    - record (117-156)
unstract/core/src/unstract/core/metrics/backends/queue.py (3)
  unstract/core/src/unstract/core/metrics/types.py (1)
    - MetricEvent (52-96)
  unstract/core/src/unstract/core/metrics/backends/base.py (4)
    - AbstractMetricBackend (8-40)
    - close (34-40)
    - record (15-24)
    - flush (27-32)
  unstract/core/src/unstract/core/metrics/__init__.py (1)
    - record (58-119)
unstract/core/src/unstract/core/metrics/config.py (3)
  unstract/core/src/unstract/core/metrics/backends/base.py (1)
    - AbstractMetricBackend (8-40)
  unstract/core/src/unstract/core/metrics/backends/noop.py (1)
    - NoopBackend (14-44)
  unstract/core/src/unstract/core/metrics/backends/queue.py (1)
    - QueueBackend (28-215)
backend/dashboard_metrics/serializers.py (2)
  backend/dashboard_metrics/models.py (4)
    - EventMetricsHourly (39-142)
    - Meta (109-142)
    - Meta (215-248)
    - Meta (321-354)
  backend/dashboard_metrics/views.py (2)
    - series (113-189)
    - summary (70-109)
backend/dashboard_metrics/migrations/0001_initial.py (1)
  backend/dashboard_metrics/migrations/0003_add_daily_monthly_tables.py (1)
    - Migration (10-261)
unstract/core/src/unstract/core/metrics/backends/__init__.py (3)
  unstract/core/src/unstract/core/metrics/backends/base.py (1)
    - AbstractMetricBackend (8-40)
  unstract/core/src/unstract/core/metrics/backends/noop.py (1)
    - NoopBackend (14-44)
  unstract/core/src/unstract/core/metrics/backends/queue.py (1)
    - QueueBackend (28-215)
backend/dashboard_metrics/migrations/0004_update_cleanup_periodic_tasks.py (2)
  backend/dashboard_metrics/migrations/0002_setup_cleanup_periodic_task.py (1)
    - Migration (40-51)
  backend/dashboard_metrics/migrations/0003_add_daily_monthly_tables.py (1)
    - Migration (10-261)
backend/dashboard_metrics/cache.py (3)
  backend/utils/user_context.py (1)
    - UserContext (8-32)
  backend/utils/cache_service.py (1)
    - clear_cache (41-43)
  backend/dashboard_metrics/views.py (1)
    - overview (193-245)
🪛 Ruff (0.14.10)
unstract/core/src/unstract/core/metrics/backends/noop.py
25-25: Unused method argument: event
(ARG002)
unstract/core/src/unstract/core/metrics/__init__.py
39-53: __all__ is not sorted
Apply an isort-style sorting to __all__
(RUF022)
113-113: Do not catch blind exception: Exception
(BLE001)
114-114: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
backend/backend/celery_config.py
42-49: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
56-66: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
69-73: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
unstract/core/src/unstract/core/metrics/registry.py
38-38: Avoid specifying long messages outside the exception class
(TRY003)
backend/dashboard_metrics/migrations/0002_setup_cleanup_periodic_task.py
6-6: Unused function argument: schema_editor
(ARG001)
33-33: Unused function argument: schema_editor
(ARG001)
41-44: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
46-51: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
backend/dashboard_metrics/tasks.py
184-184: Do not catch blind exception: Exception
(BLE001)
257-257: Do not catch blind exception: Exception
(BLE001)
258-258: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
311-311: Do not catch blind exception: Exception
(BLE001)
312-312: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
365-365: Do not catch blind exception: Exception
(BLE001)
366-366: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
backend/dashboard_metrics/tests/test_tasks.py
149-149: Loop control variable i not used within loop body
Rename unused i to _i
(B007)
backend/dashboard_metrics/views.py
47-47: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
48-48: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
50-50: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
51-51: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
52-52: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
60-60: Abstract raise to an inner function
(TRY301)
60-60: Avoid specifying long messages outside the exception class
(TRY003)
65-65: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
66-66: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
66-66: Avoid specifying long messages outside the exception class
(TRY003)
193-193: Unused method argument: request
(ARG002)
359-359: Do not catch blind exception: Exception
(BLE001)
360-360: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
403-403: Unused method argument: request
(ARG002)
436-436: Consider moving this statement to an else block
(TRY300)
437-437: Do not catch blind exception: Exception
(BLE001)
438-438: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
448-448: Consider moving this statement to an else block
(TRY300)
449-449: Do not catch blind exception: Exception
(BLE001)
450-450: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
backend/dashboard_metrics/models.py
113-130: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
131-142: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
219-236: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
237-248: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
325-342: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
343-354: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
backend/dashboard_metrics/capture.py
45-45: Consider moving this statement to an else block
(TRY300)
unstract/core/src/unstract/core/metrics/backends/queue.py
95-96: try-except-pass detected, consider logging the exception
(S110)
95-95: Do not catch blind exception: Exception
(BLE001)
101-101: Consider moving this statement to an else block
(TRY300)
103-103: Do not catch blind exception: Exception
(BLE001)
113-113: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
115-115: Avoid specifying long messages outside the exception class
(TRY003)
149-149: Consider moving this statement to an else block
(TRY300)
152-152: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
154-154: Do not catch blind exception: Exception
(BLE001)
155-155: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
212-212: Do not catch blind exception: Exception
(BLE001)
backend/dashboard_metrics/migrations/0003_add_daily_monthly_tables.py
11-14: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
16-261: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
backend/dashboard_metrics/serializers.py
112-123: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
backend/dashboard_metrics/migrations/0001_initial.py
13-15: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
17-136: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
backend/dashboard_metrics/migrations/0004_update_cleanup_periodic_tasks.py
9-9: Unused function argument: schema_editor
(ARG001)
60-60: Unused function argument: schema_editor
(ARG001)
69-72: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
74-79: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
backend/dashboard_metrics/cache.py
84-84: Probable use of insecure hash functions in hashlib: md5
(S324)
163-163: Consider moving this statement to an else block
(TRY300)
164-164: Do not catch blind exception: Exception
(BLE001)
165-165: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
252-252: Do not catch blind exception: Exception
(BLE001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (28)
backend/dashboard_metrics/__init__.py (1)
1-4: LGTM!
Clean package initializer with clear documentation describing the Dashboard Metrics app's purpose.
backend/dashboard_metrics/migrations/__init__.py (1)
1-1: LGTM!
Standard migrations package initializer following Django conventions.
backend/backend/settings/base.py (1)
345-345: LGTM!
The dashboard_metrics app is correctly registered in SHARED_APPS, making it available across all organizations in the multi-tenant setup. This is appropriate for organization-scoped metrics collection and querying.
unstract/core/src/unstract/core/metrics/backends/__init__.py (1)
1-11: LGTM!
Clean public API definition for the metrics backends package. The `__all__` declaration correctly matches the imported backends, providing clear control over the public interface.
backend/dashboard_metrics/apps.py (1)
6-11: LGTM!
Standard Django app configuration following Django 4.2+ best practices. The BigAutoField default is appropriate, and the app naming is correct.
backend/dashboard_metrics/tests/__init__.py (1)
1-1: Tests package init is fine
Docstring-only `__init__` is harmless and documents the test package clearly.
unstract/core/src/unstract/core/metrics/backends/base.py (1)
1-40: LGTM! Clean abstract interface design.
The abstract base class follows SOLID principles with a minimal, focused interface. The concrete `close()` method with a default implementation calling `flush()` is a good design choice that simplifies backend implementations while allowing customization when needed.
unstract/core/src/unstract/core/metrics/backends/noop.py (1)
21-44: LGTM! Appropriate no-op implementation.
The NoopBackend correctly implements the abstract interface and provides helpful one-time logging to inform users that metrics are disabled. The `event` parameter in `record()` is required by the `AbstractMetricBackend` interface, so the static analysis hint (ARG002) is a false positive.
backend/dashboard_metrics/tests/test_tasks.py (1)
1-553: Excellent test coverage!
The test suite is comprehensive and well-organized, covering time truncation helpers, batch processing, bulk upsert operations, cleanup tasks, and full integration flows. Good use of fixtures, edge cases, and assertions to verify data integrity across all metrics tables.
backend/dashboard_metrics/migrations/0002_setup_cleanup_periodic_task.py (1)
1-51: LGTM! Migration follows Django conventions.
The migration correctly sets up a periodic cleanup task with proper forward and rollback operations. The static analysis hints (ARG001, RUF012) are false positives: the `schema_editor` parameter and migration class attributes follow Django's required conventions for data migrations.
unstract/core/src/unstract/core/metrics/config.py (1)
1-68: LGTM! Well-designed configuration module.
The environment-driven backend selection with singleton caching is clean and robust. The fallback to `NoopBackend` when metrics are disabled or misconfigured ensures safe defaults, and the logging at each decision point aids troubleshooting. The `reset_backend()` function is a good addition for testing scenarios.
unstract/core/src/unstract/core/metrics/__init__.py (1)
58-119: Well-structured public API.
The `record()` function provides a clean entry point with proper validation, enum-to-string conversion, and graceful error handling that returns `False` on failures rather than propagating exceptions. The design appropriately decouples the public API from backend implementation details.
unstract/core/src/unstract/core/metrics/registry.py (1)
1-60: Clean registry implementation.
The registry provides efficient O(1) validation via set lookup, clear separation between validation and type retrieval, and appropriate error handling with `ValueError` for unknown metrics.
backend/backend/celery_config.py (2)
36-53: Good production-ready resilience settings.
The broker connection retry configuration with exponential backoff (`interval_step`, `interval_max`), socket keepalive, and worker limits (`max_tasks_per_child`) will help maintain stability under broker disruptions and prevent memory leaks from long-running workers.
68-73: Task names in `task_routes` are correctly defined and match registered Celery tasks.
The task names (`dashboard_metrics.process_events`, `dashboard_metrics.cleanup_hourly_data`, `dashboard_metrics.cleanup_daily_data`) match their corresponding `@shared_task` definitions in `backend/dashboard_metrics/tasks.py`. Routing is correctly configured.
unstract/core/src/unstract/core/metrics/types.py (1)
8-32: Well-structured metric type definitions.
The dual inheritance from `str` and `Enum` enables both type safety and string comparisons. The separation of counters (summed values) and histograms (distributions) is clearly documented and follows established metrics patterns.
backend/dashboard_metrics/urls.py (1)
17-33: Clean URL configuration with proper ordering.
The URL patterns are well-organized with specific paths before the UUID capture pattern, preventing routing conflicts. Using `format_suffix_patterns` enables content negotiation for API responses.
backend/dashboard_metrics/migrations/0004_update_cleanup_periodic_tasks.py (1)
60-65: Reverse migration is correct as written.
Migration 0002 creates the `dashboard_metrics_cleanup_hourly` task. Migration 0004's forward operation uses `update_or_create` on that existing task (not creating a new one), adds the daily task, and deletes a legacy task. The reverse should return to the state after 0002, which means retaining the hourly task (still managed by 0002's reverse function) and removing only what 0004 added: the daily task. The current implementation is correct.
Likely an incorrect or invalid review comment.
unstract/core/src/unstract/core/metrics/backends/queue.py (1)
178-198: Celery protocol v2 format is compatible with the project's Celery version (5.3.4+).
The message construction follows the standard kombu protocol v2 format `[args, kwargs, embed]`, which remains compatible with Celery 5.3.4+ (and 5.5.3 in the workers package). The headers correctly specify the task name and ID for Celery task routing. No compatibility concerns.
backend/dashboard_metrics/views.py (1)
402-450: Health checks look sound and side-effect free.
`health`, `_check_database`, and `_check_cache` do minimal, safe operations (an `exists()` call and a short-lived cache set/get) and return structured status with appropriate HTTP 200/503 responses; this is suitable for liveness/readiness probes.
backend/dashboard_metrics/migrations/0003_add_daily_monthly_tables.py (1)
16-260: Schema and indexing for daily/monthly tables look consistent and appropriate.
The daily/monthly models mirror the hourly schema (including `labels` default, project/tag, organization FK, unique constraints, and GIN/index coverage), which should give predictable semantics and good query performance across all granularities.
backend/dashboard_metrics/services.py (2)
383-447: `get_all_metrics_summary` aggregates are consistent with individual query methods.
The summary method correctly reuses the per-metric query functions and normalizes potential `NULL` sums (e.g., pages and LLM usage) with `or 0`, producing a clean dict of totals for the live summary endpoint.
230-269: The join logic in `get_deployed_api_requests` is correct.
`WorkflowExecution.pipeline_id` is a UUIDField designed to hold IDs of either `Pipeline` or `APIDeployment` entities (as evidenced by the conditional logic in `backend/workflow_manager/internal_serializers.py`). The subquery filtering `pipeline_id__in=Subquery(APIDeployment.objects.filter(...).values("id"))` correctly extracts deployed API request counts by matching executions where the pipeline_id refers to an API deployment.
backend/dashboard_metrics/migrations/0001_initial.py (1)
17-135: Hourly metrics migration aligns with model definition and indexing strategy.
Field definitions, indexes, and the `unique_hourly_metric` constraint all match the `EventMetricsHourly` model description, so the initial schema for hourly metrics looks correct and production-ready.
backend/dashboard_metrics/serializers.py (1)
11-124: Query and response serializers are consistent with view usage and enforce sane bounds.
`MetricsQuerySerializer.validate()` correctly defaults the date range, enforces `start_date <= end_date`, and caps queries at 90 days, while the nested series/summary serializers match the shapes constructed in `DashboardMetricsViewSet`. The `EventMetricsHourlySerializer` is read-only and exposes the expected fields.
backend/dashboard_metrics/cache.py (2)
71-145: Caching decorator behavior and keying strategy look solid.
The decorator scopes cache entries by organization, endpoint name, and a stable hash of sorted query params, and only caches 200 responses. It also cleanly bypasses caching when no organization context is present, which is important for correctness with per-org data.
148-167: No changes needed; code is correct.
`CacheService` is a class in `backend/utils/cache_service.py`, and `clear_cache` is a static method of that class. The import and call in `invalidate_metrics_cache` are correct. The implementation will properly invalidate cached metrics for the organization.
Likely an incorrect or invalid review comment.
backend/dashboard_metrics/models.py (1)
57-85: Confirm `BaseModel` doesn't define `id`, and that `metric_count`'s default is intentional
Two small checks:
- If `BaseModel` already defines an `id` field, redefining `id` here as a primary key can cause confusion or conflicts. If `BaseModel` only adds timestamps and no `id`, you're fine; otherwise, consider centralizing the PK definition in one place.
- `metric_count` defaults to `1`. That makes sense if every row is always created from at least one event, but if you ever create rows with explicit `metric_count` or rely on DB-side aggregation, this default could overcount silently.
Please double-check that `BaseModel` does not define `id`, and that you never rely on the default `metric_count` for anything other than "one event just created this row".
def _bulk_upsert_hourly(aggregations: dict) -> tuple[int, int]:
    """Bulk upsert hourly aggregations.

    Args:
        aggregations: Dict of aggregated metric data keyed by
            (org_id, hour_ts_str, metric_name, project, tag)

    Returns:
        Tuple of (created_count, updated_count)
    """
    created, updated = 0, 0
    with transaction.atomic():
        for key, agg in aggregations.items():
            org_id, hour_ts_str, metric_name, project, tag = key
            hour_ts = datetime.fromisoformat(hour_ts_str)

            try:
                obj, was_created = (
                    EventMetricsHourly.objects.select_for_update().get_or_create(
                        organization_id=org_id,
                        timestamp=hour_ts,
                        metric_name=metric_name,
                        project=project,
                        tag=tag,
                        defaults={
                            "metric_type": agg["metric_type"],
                            "metric_value": agg["value"],
                            "metric_count": agg["count"],
                            "labels": agg["labels"],
                        },
                    )
                )
                if was_created:
                    created += 1
                else:
                    obj.metric_value = F("metric_value") + agg["value"]
                    obj.metric_count = F("metric_count") + agg["count"]
                    if agg["labels"]:
                        obj.labels = {**obj.labels, **agg["labels"]}
                    obj.save(
                        update_fields=[
                            "metric_value",
                            "metric_count",
                            "labels",
                            "modified_at",
                        ]
                    )
                    updated += 1
            except Exception as e:
                logger.error(f"Error upserting hourly metric {key}: {e}")

    return created, updated
DB errors are swallowed inside transactions, defeating Celery autoretry and risking silent data loss
In _bulk_upsert_hourly/_daily/_monthly and in the cleanup tasks:
- Each function wraps DB work in `transaction.atomic()` but also catches a broad `Exception` inside the block and only logs it.
- For the Celery tasks (`process_dashboard_metric_events`, `cleanup_hourly_metrics`, `cleanup_daily_metrics`) you've configured `autoretry_for=(DatabaseError, OperationalError)`, but those DB exceptions are caught and converted into logs/return dicts, so Celery never sees them and won't retry.
- Catching DB errors inside a single `atomic()` block without calling `transaction.set_rollback(True)` or re-raising can also leave the transaction marked for rollback and cause subsequent operations in the same block to fail, leading to partial or fully rolled-back writes while the task still reports success.
This combination can make metrics silently incomplete under DB issues (locks, timeouts, transient failures) and is at odds with the production-readiness goal.
I’d recommend:
- In the upsert helpers: let `DatabaseError`/`OperationalError` bubble up so Celery can apply its retry/backoff policy. If you want per-key resilience, move try/except outside the `atomic()` block or drastically narrow the except clause and re-raise DB errors.
- In the cleanup tasks: don't catch `DatabaseError`/`OperationalError` at all (or catch, log with `logger.exception(...)`, then `raise`) so `autoretry_for` actually takes effect. Keep a separate `except Exception` handler only for truly unexpected non-DB errors if you still want to return a failure dict.
Example pattern for the hourly upsert (same idea applies to daily/monthly and cleanup):
    with transaction.atomic():
        for key, agg in aggregations.items():
            org_id, hour_ts_str, metric_name, project, tag = key
            hour_ts = datetime.fromisoformat(hour_ts_str)
            obj, was_created = (
                EventMetricsHourly.objects.select_for_update().get_or_create(
                    organization_id=org_id,
                    timestamp=hour_ts,
                    metric_name=metric_name,
                    project=project,
                    tag=tag,
                    defaults={
                        "metric_type": agg["metric_type"],
                        "metric_value": agg["value"],
                        "metric_count": agg["count"],
                        "labels": agg["labels"],
                    },
                )
            )
            # same update logic, no broad except here

and for cleanup:
    try:
        deleted_count, _ = EventMetricsHourly.objects.filter(
            timestamp__lt=cutoff
        ).delete()
    except (DatabaseError, OperationalError):
        logger.exception("Error during hourly cleanup")
        raise
    except Exception:
        logger.exception("Unexpected error during hourly cleanup")
        return {"success": False, "retention_days": retention_days}

This keeps the resilience you want while letting Celery handle transient DB failures correctly.
Also applies to: 263-315, 317-369, 371-414, 424-458
def _get_connection(self) -> Connection:
    """Get or create a connection to the broker with retry logic."""
    if self._connection is not None and self._connection.connected:
        return self._connection
    return self._get_connection_with_retry()
Connection connected property may return stale state.
The _connection.connected check at line 65 may return True even if the broker has disconnected (e.g., broker restart). Consider wrapping the connection reuse in a try-except that falls back to _get_connection_with_retry() on first publish failure, or use Kombu's ensure_connection() pattern.
🔎 Suggested improvement
def _get_connection(self) -> Connection:
"""Get or create a connection to the broker with retry logic."""
- if self._connection is not None and self._connection.connected:
- return self._connection
- return self._get_connection_with_retry()
+ if self._connection is not None:
+ try:
+ # Verify connection is actually usable
+ self._connection.ensure_connection(max_retries=1)
+ return self._connection
+ except Exception:
+ pass # Fall through to reconnect
+ return self._get_connection_with_retry()📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def _get_connection(self) -> Connection: | |
| """Get or create a connection to the broker with retry logic.""" | |
| if self._connection is not None and self._connection.connected: | |
| return self._connection | |
| return self._get_connection_with_retry() | |
| def _get_connection(self) -> Connection: | |
| """Get or create a connection to the broker with retry logic.""" | |
| if self._connection is not None: | |
| try: | |
| # Verify connection is actually usable | |
| self._connection.ensure_connection(max_retries=1) | |
| return self._connection | |
| except Exception: | |
| pass # Fall through to reconnect | |
| return self._get_connection_with_retry() |
def record(self, event: MetricEvent) -> bool:
    """Send a metric event as a Celery task.

    Args:
        event: The MetricEvent to queue

    Returns:
        True if the event was queued successfully, False otherwise
    """
    try:
        event_data = self._create_event_data(event)
        task_message = self._create_celery_message(event_data)
        conn = self._get_connection()

        with conn.Producer() as producer:
            producer.publish(
                task_message,
                exchange=self.exchange,
                routing_key=self.queue_name,
                declare=[self.queue],
                delivery_mode=2,  # Persistent
                content_type="application/json",
                content_encoding="utf-8",
                headers={
                    "task": self.task_name,
                    "id": str(uuid.uuid4()),
                },
            )

        logger.debug(
            f"Queued metric task: {event.metric_name} for org {event.org_id}"
        )
        return True

    except KombuError as e:
        logger.error(f"Failed to queue metric {event.metric_name}: {e}")
        return False
    except Exception as e:
        logger.error(f"Unexpected error queuing metric: {e}")
        return False
🧩 Analysis chain
🌐 Web query:
Kombu Connection thread-safe publishing messages concurrent
💡 Result:
Short answer: Kombu (and the underlying py-amqp transport) is not safe for concurrent publishing from multiple threads using the same Connection/Channel objects. Share a Connection only if you ensure each thread uses its own Channel (or a thread-safe wrapper/pool) or protect access with a lock; better: give each thread its own Connection/Channel or use a connection pool or a thread-safe transport implementation. [1][2][3][4]
Recommendations
- Do not share Channel objects across threads — channels are channel-bound and can interleave AMQP frames if used concurrently. Use one channel per thread. [3][4]
- Prefer one producer Connection/Channel per thread, or use a Connection/Channel pool to reuse resources safely. [1][4]
- If you must share a Connection, serialize all operations on it with a threading.Lock (but this reduces concurrency). [3]
- Consider using a project that provides a thread-safe kombu transport/wrapper (e.g., kombu-pyamqp-threadsafe) if you need shared connections with multithreading. [1]
Sources
- kombu-pyamqp-threadsafe project (thread-safe wrapper for kombu/pyamqp). [1]
- StackOverflow discussion on publishing from consumer/threads with Kombu (issues when mixing consuming/publishing on same thread). [2]
- Kombu/py-amqp mailing-list / discussion about thread-safety (channels bound to threads; beware interleaved writes). [3]
- rabbitpy documentation on multi-threaded use (recommend one channel per thread). [4]
🏁 Script executed:
# Find and examine the _get_connection method and class initialization
cd unstract/core/src/unstract/core/metrics/backends && grep -n "_get_connection\|__init__\|_connection" queue.py | head -20

Repository: Zipstack/unstract
Length of output: 828
🏁 Script executed:
cd unstract/core/src/unstract/core/metrics/backends && sed -n '35,110p' queue.py

Repository: Zipstack/unstract
Length of output: 2738
Kombu Connection is not thread-safe—add locking or use connection pooling.
The code caches self._connection as an instance variable without synchronization. Concurrent calls to record() from multiple threads will race on self._connection state (the check on line 65 is not atomic), and Kombu does not support concurrent publishing from the same Connection. For multi-threaded metrics recording, protect the connection with a threading.Lock, use thread-local connections, or implement a connection pool.
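A minimal sketch of the locking option; attribute names mirror the snippet above, and for full safety the same lock (or a per-thread connection) would also need to cover the publish itself:

```python
import threading

from kombu import Connection

class QueueBackend:
    def __init__(self, broker_url: str) -> None:
        self.broker_url = broker_url
        self._connection: Connection | None = None
        self._connection_lock = threading.Lock()

    def _get_connection(self) -> Connection:
        """Serialize the check-and-create so threads can't race on state."""
        with self._connection_lock:
            if self._connection is not None and self._connection.connected:
                return self._connection
            self._connection = Connection(self.broker_url)
            self._connection.connect()
            return self._connection
```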
# Mapping of metric names to their types
METRIC_TYPE_MAP: dict[str, MetricType] = {
    # Counters
    MetricName.DOCUMENTS_PROCESSED.value: MetricType.COUNTER,
    MetricName.PAGES_PROCESSED.value: MetricType.COUNTER,
    MetricName.PROMPT_EXECUTIONS.value: MetricType.COUNTER,
    MetricName.LLM_CALLS.value: MetricType.COUNTER,
    MetricName.CHALLENGES.value: MetricType.COUNTER,
    MetricName.SUMMARIZATION_CALLS.value: MetricType.COUNTER,
    # Histograms
    MetricName.DEPLOYED_API_REQUESTS.value: MetricType.HISTOGRAM,
    MetricName.ETL_PIPELINE_EXECUTIONS.value: MetricType.HISTOGRAM,
    MetricName.LLM_USAGE.value: MetricType.HISTOGRAM,
}
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check if tests exist to verify METRIC_TYPE_MAP completeness
rg -n 'METRIC_TYPE_MAP' --type=py -g '**/test*'

Repository: Zipstack/unstract
Length of output: 43
🏁 Script executed:
#!/bin/bash
# Find MetricName enum definition
rg -n 'class MetricName|enum MetricName' --type=py -A 20

Repository: Zipstack/unstract
Length of output: 1805
🏁 Script executed:
#!/bin/bash
# Find get_metric_type() implementation
rg -n 'def get_metric_type' --type=py -A 10

Repository: Zipstack/unstract
Length of output: 933
🏁 Script executed:
#!/bin/bash
# Check the full types.py file to understand the structure
cat -n unstract/core/src/unstract/core/metrics/types.py

Repository: Zipstack/unstract
Length of output: 3933
🏁 Script executed:
#!/bin/bash
# Get the full get_metric_type() implementation
cat -n unstract/core/src/unstract/core/metrics/registry.py | head -50

Repository: Zipstack/unstract
Length of output: 1566
🏁 Script executed:
#!/bin/bash
# Search for test files in metrics module
fd '.*test.*\.py$' --type f | grep -i metric

Repository: Zipstack/unstract
Length of output: 106
🏁 Script executed:
#!/bin/bash
# Search for any validation or assertions related to METRIC_TYPE_MAP
rg -n 'assert.*METRIC_TYPE_MAP|METRIC_TYPE_MAP.*in|in.*METRIC_TYPE_MAP' --type=py

Repository: Zipstack/unstract
Length of output: 200
METRIC_TYPE_MAP must stay in sync with MetricName.
While validate_metric() checks if a metric exists in the enum, it does not ensure the metric is in METRIC_TYPE_MAP. If a new metric is added to MetricName without updating METRIC_TYPE_MAP, get_metric_type() will raise a KeyError when accessing the map on line 39 of registry.py. Add a unit test to verify all MetricName values have corresponding entries in METRIC_TYPE_MAP.
🤖 Prompt for AI Agents
In @unstract/core/src/unstract/core/metrics/types.py around lines 35-48, Add a
unit test that ensures every value of the MetricName enum has a corresponding
key in METRIC_TYPE_MAP to prevent KeyError in get_metric_type(); specifically,
iterate over MetricName (or MetricName.__members__.values()/MetricName) and
assert that each .value is present in METRIC_TYPE_MAP, referencing
METRIC_TYPE_MAP and MetricName; this complements validate_metric() and prevents
missing entries that would cause get_metric_type() to fail.
chandrasekharan-zipstack
left a comment
Changes mostly LGTM. Left some minor comments.
One main question I have is around cache invalidation of the cached, aggregated data - does it happen only on the respective TTLs? I hope possibly receiving stale data for that duration is okay in this use case
    worker_prefetch_multiplier = 4
    worker_max_tasks_per_child = 1000
@athul-rs any reason why this or other config was added? I hope it doesn't have any side effects on other workers
@athul-rs do these tests run on every PR as part of the existing unit tests? If not, please ensure that it's integrated there as well; otherwise your effort on these tests won't be realised
backend/dashboard_metrics/cache.py
Outdated
    CACHE_TTL_CURRENT_HOUR = 30  # 30 seconds for current hour (updating frequently)
    CACHE_TTL_HISTORICAL = 8 * 60 * 60  # 8 hours for historical data (stable)

    # Legacy TTLs for compatibility with existing endpoints
@athul-rs what existing endpoints do we have that require such legacy TTLs? Check and correct this comment
backend/dashboard_metrics/cache.py
Outdated
    # Cache TTL values (in seconds)
    # Time-aware TTLs per documentation
    CACHE_TTL_CURRENT_HOUR = 30  # 30 seconds for current hour (updating frequently)
NIT: @athul-rs I would suggest defining these values in settings/base.py so that they can optionally be configured via env vars with some sane defaults. Make sure the names are updated / prefixed with DASHBOARD_ to group them separately
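A sketch of that suggestion, with assumed DASHBOARD_-prefixed names and env overrides defaulting to the current values:

```python
# backend/backend/settings/base.py (sketch)
import os

DASHBOARD_CACHE_TTL_CURRENT_HOUR = int(
    os.environ.get("DASHBOARD_CACHE_TTL_CURRENT_HOUR", "30")
)
DASHBOARD_CACHE_TTL_HISTORICAL = int(
    os.environ.get("DASHBOARD_CACHE_TTL_HISTORICAL", str(8 * 60 * 60))
)
```

`cache.py` would then read these via `django.conf.settings` instead of defining module-level constants.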
| """ | ||
| # Sort params for consistent hashing | ||
| sorted_params = json.dumps(params, sort_keys=True, default=str) | ||
| params_hash = hashlib.md5(sorted_params.encode()).hexdigest()[:12] |
NIT: Possibility of cache collision since we only take the first 12 hex digits, but this should mostly suffice since keys are separated per endpoint. Do check if a change is needed here
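If a change is made, one low-risk option is sha256 with a longer slice; collision odds drop sharply with digest length:

```python
import hashlib
import json

def _params_hash(params: dict) -> str:
    # Hypothetical helper name; mirrors the hashing shown above
    sorted_params = json.dumps(params, sort_keys=True, default=str)
    return hashlib.sha256(sorted_params.encode()).hexdigest()[:24]
```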
    except KeyError as e:
        logger.warning(f"Skipping event with missing required field: {e}")
        errors += 1
    except Exception as e:
        logger.warning(f"Error processing event: {e}")
NIT: Consider adding a stack trace
    organization = UserContext.get_organization()
    if not organization:
        logger.warning("No organization context for metrics request")
        raise PermissionDenied("No organization context")
    return EventMetricsHourly.objects.filter(organization=organization)
@athul-rs check if this is required. This would be handled by the model manager itself usually
NIT: This is a cool idea and if refactored a bit to make it generic, we can easily extend the same to any other API / DB call that we wish to cache. Consider this suggestion if it's easy to make it generic
    from ..types import MetricEvent
    from .base import AbstractMetricBackend
NIT: Consider absolute imports instead
            return self._connection
        return self._get_connection_with_retry()

    def _get_connection_with_retry(
@athul-rs why do we need to implement this? Isn't it available / handled within celery / kombu?
Test Results
Summary
Runner Tests - Full Report
SDK1 Tests - Full Report
Actionable comments posted: 6
Note
Due to the large number of review comments, Critical severity comments were prioritized as inline comments.
🤖 Fix all issues with AI agents
In @backend/dashboard_metrics/management/commands/backfill_metrics.py:
- Around line 269-298: The outer transaction.atomic() in _upsert_hourly wraps
all update_or_create calls so a DB error (e.g., IntegrityError) can leave the
transaction unusable and abort the rest of the loop; remove the outer
transaction.atomic() (or alternatively use a per-row savepoint) so each
EventMetricsHourly._base_manager.update_or_create call runs in its own atomic
context and the loop can continue on exceptions; also change the unused returned
variable name obj to _ and apply the same change (remove outer atomic or use
per-row savepoints and rename obj → _) in the corresponding daily/monthly upsert
methods that use update_or_create.
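A sketch of the per-row variant the prompt describes; the key shape follows the hourly upsert shown elsewhere in this review, and the `defaults` here are simplified:

```python
from django.db import DatabaseError, transaction

# inside the backfill command (sketch)
def _upsert_hourly(self, aggregations: dict) -> None:
    for key, agg in aggregations.items():
        org_id, hour_ts, metric_name, project, tag = key
        try:
            # One atomic block (savepoint) per row, so a failure on one
            # row cannot poison the transaction for the remaining rows.
            with transaction.atomic():
                _, _created = EventMetricsHourly._base_manager.update_or_create(
                    organization_id=org_id,
                    timestamp=hour_ts,
                    metric_name=metric_name,
                    project=project,
                    tag=tag,
                    defaults={"metric_value": agg["value"]},
                )
        except DatabaseError:
            logger.exception("Failed to upsert hourly row %s; continuing", key)
```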
In @backend/dashboard_metrics/management/commands/generate_metrics_test_data.py:
- Around line 235-283: The re-query after bulk_create in
_generate_workflow_executions is unreliable and may misalign _target_created_at
values; instead use the list returned by
WorkflowExecution.objects.bulk_create(executions) (these objects have PKs
populated) and iterate that returned list to update each record's created_at via
individual QuerySet.update(id=exec_obj.id) using exec_obj._target_created_at (do
not attempt bulk_update because created_at uses auto_now_add).
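A sketch of the suggested flow; `_target_created_at` is the attribute named in the prompt and is assumed to be pre-set on each unsaved object:

```python
# bulk_create returns the created objects with PKs populated
created_objs = WorkflowExecution.objects.bulk_create(executions)
for exec_obj in created_objs:
    # created_at uses auto_now_add, so it can't go through bulk_update;
    # correct it with a per-row queryset update instead.
    WorkflowExecution.objects.filter(id=exec_obj.id).update(
        created_at=exec_obj._target_created_at
    )
```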
In @backend/dashboard_metrics/tasks.py:
- Around line 222-262: The current broad except in the upsert loop swallows
IntegrityError from concurrent inserts causing lost increments; import
django.db.IntegrityError and explicitly catch IntegrityError around the
get_or_create in EventMetricsHourly upsert logic (the block that calls
select_for_update().get_or_create and currently sets obj, was_created), then on
IntegrityError re-query the row (EventMetricsHourly.objects.get(...) for the
same organization_id, timestamp, metric_name, project, tag), set
was_created=False and proceed to update counters and labels as in the existing
else-path; keep the generic Exception handler for other errors but ensure you
continue after logging; apply the same explicit IntegrityError catch-and-retry
pattern to the _bulk_upsert_daily and _bulk_upsert_monthly blocks as well.
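A minimal sketch of the catch-and-retry pattern described above (model and field names taken from the prompt; a sketch, not the task's actual code):

```python
from django.db import IntegrityError, transaction

try:
    with transaction.atomic():
        obj, was_created = (
            EventMetricsHourly.objects.select_for_update().get_or_create(
                organization_id=org_id,
                timestamp=timestamp,
                metric_name=metric_name,
                project=project,
                tag=tag,
                defaults=defaults,  # counter fields, elided here
            )
        )
except IntegrityError:
    # A concurrent worker inserted the same row first: re-query it and
    # fall through to the update path instead of dropping the increment.
    obj = EventMetricsHourly.objects.get(
        organization_id=org_id,
        timestamp=timestamp,
        metric_name=metric_name,
        project=project,
        tag=tag,
    )
    was_created = False
```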
In @unstract/core/src/unstract/core/metrics/backends/queue.py:
- Around line 35-62: The QueueBackend __init__ creates self.exchange =
Exchange("", type="direct") which uses the AMQP default exchange and doesn't
match the implicit queue-named exchange used by Celery; change the exchange to a
named direct exchange that matches the queue name (e.g., use
Exchange(self.queue_name, type="direct") or Exchange("dashboard_metric_events",
type="direct") when constructing self.exchange) so published messages route to
the KombuQueue (self.queue) consumers.
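A minimal sketch of the named-exchange fix described above, using the Kombu API (queue name taken from the prompt):

```python
from kombu import Exchange, Queue

queue_name = "dashboard_metric_events"
# A named direct exchange with a routing key equal to the queue name mirrors
# the implicit routing that Celery workers declare for their queues.
exchange = Exchange(queue_name, type="direct")
queue = Queue(queue_name, exchange=exchange, routing_key=queue_name)
```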
🟠 Major comments (19)
backend/dashboard_metrics/management/commands/populate_test_metrics.py-22-33 (1)
22-33: Metric types appear inconsistent with documentation.

The `METRICS` list defines:

- `pages_processed` as `COUNTER` (line 25)
- `deployed_api_requests` as `HISTOGRAM` (line 30)
- `etl_pipeline_executions` as `HISTOGRAM` (line 31)

However, the README documentation (lines 173-179) indicates:

- `pages_processed` should be `HISTOGRAM` (sums page counts)
- `deployed_api_requests` and `etl_pipeline_executions` should be `COUNTER` (counting events)

This mismatch could cause incorrect test data generation and potentially mask issues during testing.
🐛 Proposed fix
```diff
 METRICS = [
     ("documents_processed", MetricType.COUNTER),
-    ("pages_processed", MetricType.COUNTER),
+    ("pages_processed", MetricType.HISTOGRAM),
     ("prompt_executions", MetricType.COUNTER),
     ("llm_calls", MetricType.COUNTER),
     ("challenges", MetricType.COUNTER),
     ("summarization_calls", MetricType.COUNTER),
-    ("deployed_api_requests", MetricType.HISTOGRAM),
-    ("etl_pipeline_executions", MetricType.HISTOGRAM),
+    ("deployed_api_requests", MetricType.COUNTER),
+    ("etl_pipeline_executions", MetricType.COUNTER),
     ("llm_usage", MetricType.HISTOGRAM),
 ]
```

frontend/src/hooks/useMetricsData.js-21-75 (1)
21-75: Fix hook dependency handling: include callbacks in effect dependency arrays instead of using `eslint-disable-line`.

The `eslint-disable-line` comments hide a real issue: `useMetricsOverview`, `useMetricsSummary`, and `useMetricsSeries` all omit their fetch callbacks from the effect dependencies (lines 141, 227, 313), even though these callbacks are in the useCallback dependencies. If `axiosPrivate` changes (e.g., during auth token refresh), the callbacks are recreated, but the effects won't re-run, so they'll continue calling stale callbacks with outdated `axiosPrivate` instances.

Additionally, `useMetricsData` unnecessarily depends on the entire `options` object instead of destructuring individual properties, causing callback recreation whenever a new `options` object is passed, even if its values haven't changed.
For
useMetricsData(lines 21-75), destructureoptionsand depend on individual properties:
- Add at the top:
const { startDate, endDate, granularity, metricName } = options;- Change useCallback deps from
[axiosPrivate, options, orgId]to[axiosPrivate, orgId, startDate, endDate, granularity, metricName]- Update references from
options.startDate→startDateetc.For the three other hooks (lines 137-145, 223-231, 309-317), include the fetch callback in effect dependencies:
useMetricsOverview:}, [fetchOverview]);(removes the guard; callback has its own internal guard)useMetricsSummary:}, [fetchSummary]);useMetricsSeries:}, [fetchSeries]);backend/dashboard_metrics/migrations/0001_initial.py-17-136 (1)
17-136: Make `organization` non-nullable in all three metrics tables.

PostgreSQL UNIQUE constraints allow multiple NULL values (NULL ≠ NULL in SQL), so with `organization=null` the constraint cannot prevent duplicate rows for the same (bucket, metric_name, project, tag). All actual metric creation explicitly passes `organization_id` (never None), and the manager filters exclude NULL organizations. Since this is a multi-tenant application and metrics are always org-scoped, `organization` should be `null=False` with no `default=None` to ensure constraint integrity and prevent orphaned records; a field sketch follows.

Also applies to: 137-258 (EventMetricsDaily), 259-380 (EventMetricsMonthly)
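A minimal sketch of the non-nullable field (app label, `on_delete`, and `related_name` are assumptions, not the migration's actual values):

```python
from django.db import models

organization = models.ForeignKey(
    "account_v2.Organization",  # assumed app label
    on_delete=models.CASCADE,
    null=False,  # DB-level guarantee, so the UNIQUE constraint can hold
    related_name="hourly_metrics",  # assumed name
)
```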
backend/backend/celery_config.py-36-49 (1)
36-49: Fix invalid socket options in `broker_transport_options`: `socket_timeout` and `socket_keepalive` are not recognized by Kombu/py-amqp.

In Celery 5.x with RabbitMQ (py-amqp), the socket timeout and keepalive options in `broker_transport_options` are misconfigured:

- `socket_timeout` is not a valid transport option; use `read_timeout` and `write_timeout` instead.
- `socket_keepalive` is not a valid transport option; py-amqp enables SO_KEEPALIVE by default. If you need to tune keepalive behavior, use the `socket_settings` dict with platform-specific constants.

These invalid keys will be silently ignored, leaving sockets without timeout protection. The other configuration keys (`broker_connection_retry_on_startup`, `broker_connection_retry`, `broker_connection_max_retries`, and `task_routes`) are correctly formatted for Celery 5.x.

Recommended fix

Replace the current `broker_transport_options` with:

```python
broker_transport_options = {
    "max_retries": 3,
    "interval_start": 0,
    "interval_step": 0.2,
    "interval_max": 1,
    "read_timeout": 30,
    "write_timeout": 30,
}
```

Or, if you need platform-specific keepalive tuning:

```python
from amqp.platform import TCP_USER_TIMEOUT

broker_transport_options = {
    "max_retries": 3,
    "interval_start": 0,
    "interval_step": 0.2,
    "interval_max": 1,
    "read_timeout": 30,
    "write_timeout": 30,
    "socket_settings": {
        TCP_USER_TIMEOUT: 60000,  # ms
    },
}
```

frontend/src/components/metrics-dashboard/MetricsDashboard.jsx-55-95 (1)
55-95: Normalize date range boundaries to prevent timezone-induced off-by-one-day queries.

Converting local dates directly to UTC ISO strings can shift day boundaries by hours. When a user picks "Jan 30" in EST (UTC-5), `.toISOString()` produces `2024-01-30T05:00:00Z`, causing the query to start mid-day and miss early morning metrics. For hourly aggregations this causes timestamp misalignment; for daily/monthly it silently excludes or includes unintended days.

Explicitly set `.startOf("day")` and `.endOf("day")` before converting to ISO to ensure the backend receives the full day range in UTC.

Proposed fix (normalize dates before sending to hooks)

```diff
 @@
   const [dateRange, setDateRange] = useState([
-    dayjs().subtract(30, "day"),
-    dayjs(),
+    dayjs().subtract(30, "day").startOf("day"),
+    dayjs().endOf("day"),
   ]);
 @@
-  } = useMetricsOverview(
-    dateRange[0]?.toISOString(),
-    dateRange[1]?.toISOString()
-  );
+  const startIso = dateRange[0]?.startOf("day").toISOString();
+  const endIso = dateRange[1]?.endOf("day").toISOString();
+
+  } = useMetricsOverview(startIso, endIso);
 @@
   } = useMetricsSummary(
-    shouldFetchSummary ? dateRange[0]?.toISOString() : null,
-    shouldFetchSummary ? dateRange[1]?.toISOString() : null,
+    shouldFetchSummary ? startIso : null,
+    shouldFetchSummary ? endIso : null,
     null, // metricName
     dataSource
   );
 @@
   const handleDateChange = useCallback((dates) => {
     if (dates && dates.length === 2) {
-      setDateRange(dates);
+      setDateRange([dates[0].startOf("day"), dates[1].endOf("day")]);
     }
   }, []);
```
526-716: `aggregate_metrics_from_sources()` writes `metric_count` as "number of buckets", not "number of events" (breaks averages/derived stats).

You do `count += 1` per returned period row (lines 608-609 / 633-634 / 649-652), so `metric_count` becomes "hours with data" / "days with data", not the underlying event count. This will conflict with event-based ingestion, where `metric_count` is the number of recorded events.

At minimum, define/rename the semantic (bucket_count vs event_count) or adjust `MetricsQueryService` queries to also return an event count (e.g., `Count("id")`) so `metric_count` is consistent; a sketch follows.
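A minimal sketch of the `Count("id")` idea, with illustrative model and field names (a sketch, not the actual query-service code):

```python
from django.db.models import Count, Sum
from django.db.models.functions import TruncHour

# Return both the aggregated value and the number of source rows per bucket.
rows = (
    PageUsage.objects.filter(created_at__range=(start, end))
    .annotate(period=TruncHour("created_at"))
    .values("period")
    .annotate(
        value=Sum("pages_processed"),  # aggregated metric value
        event_count=Count("id"),       # underlying events in this bucket
    )
)
# The aggregation task can then persist event_count into metric_count
# instead of incrementing once per returned period row.
```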
backend/dashboard_metrics/views.py-786-797 (1)

786-797: `live_series` error entries don't match the success schema (frontend may crash on missing keys).

Success entries include `metric_type`, `data`, `total_value`; error entries omit `metric_type` and also omit the `project`/`tag`/`total_count` fields that the aggregated endpoints provide.

Proposed fix (keep shape stable)

```diff
 series.append(
     {
         "metric_name": metric_name,
+        "metric_type": "histogram"
+        if metric_name == "llm_usage"
+        else "counter",
         "error": "unavailable",
         "data": [],
         "total_value": 0,
+        "total_count": 0,
     }
 )
```

backend/dashboard_metrics/views.py-277-310 (1)
277-310: `average_value` should use `total_count` (events), not `len(values)` (buckets).

`metric_value` is aggregated and `metric_count` is the number of events; dividing by bucket count will skew averages (especially if there are sparse buckets).

Proposed fix

```diff
 summary_list.append(
     {
@@
-        "average_value": agg["total_value"] / len(values)
-        if values
-        else 0,
+        "average_value": (
+            agg["total_value"] / agg["total_count"]
+            if agg["total_count"]
+            else 0
+        ),
         "min_value": min(values) if values else 0,
         "max_value": max(values) if values else 0,
     }
 )
```

backend/dashboard_metrics/views.py-129-152 (1)
129-152: Monthly filtering is wrong when `start_date` is mid-month (you'll drop the starting month).

`month` rows are typically stored as the first day of the month; filtering `month__gte=start_date.date()` will exclude that month unless start_date is on day 1.

Proposed fix

```diff
 queryset = queryset.filter(
     **{
-        f"{ts_field}__gte": params["start_date"]
-        if source == "hourly"
-        else params["start_date"].date(),
+        f"{ts_field}__gte": (
+            params["start_date"]
+            if source == "hourly"
+            else (
+                params["start_date"].date().replace(day=1)
+                if source == "monthly"
+                else params["start_date"].date()
+            )
+        ),
         f"{ts_field}__lte": params["end_date"]
         if source == "hourly"
         else params["end_date"].date(),
     }
 )
```

backend/dashboard_metrics/services.py-1-35 (1)
1-35: Docstring says "Uses _base_manager for models…", but only `Usage` is bypassed (Celery aggregation will likely return zeros).

`APIDeployment`, `Pipeline`, and often `WorkflowExecution`/`PageUsage` also use organization-aware managers; in Celery context `UserContext` is absent, so `.objects` may auto-filter to empty.

```bash
#!/bin/bash
# Verify which models use DefaultOrganizationManagerMixin-like filtering.
rg -n --type=py "DefaultOrganizationManagerMixin|DefaultOrganizationMixin|UserContext\.get_organization\(" backend | head -n 200
rg -n --type=py "objects\s*=\s*.*ModelManager|objects\s*=" backend/api_v2/models.py backend/pipeline_v2/models.py backend/workflow_manager/workflow_v2/models/execution.py backend/account_usage/models.py 2>/dev/null || true
```

backend/dashboard_metrics/tasks.py-79-209 (1)
79-209: Normalize `metric_type` (string vs enum) and protect against non-numeric `metric_value`.

`QueueBackend` sends `metric_type` as a string; here the default uses `MetricType.COUNTER` (enum). Mixed types can leak into DB writes.

Proposed fix

```diff
-    metric_value = event.get("metric_value", 1)
-    metric_type = event.get("metric_type", MetricType.COUNTER)
+    metric_value = event.get("metric_value", 1)
+    try:
+        metric_value = float(metric_value)
+    except (TypeError, ValueError):
+        raise ValueError(f"Invalid metric_value: {metric_value!r}")
+
+    metric_type_raw = event.get("metric_type", MetricType.COUNTER)
+    try:
+        metric_type = MetricType(metric_type_raw)
+    except Exception:
+        metric_type = MetricType.COUNTER
```

backend/dashboard_metrics/services.py-67-399 (1)
67-399: Use a consistent "unfiltered manager" helper for all source models used by Celery tasks.

Otherwise `aggregate_metrics_from_sources()` will compute incorrect results for at least `deployed_api_requests`/`etl_pipeline_executions` (and possibly others).

Suggested pattern

```diff
+def _all(model):
+    # Prefer _base_manager to bypass org-context managers in Celery.
+    return getattr(model, "_base_manager", model.objects).all()
@@
-        WorkflowFileExecution.objects.filter(
+        _all(WorkflowFileExecution).filter(
@@
-        PageUsage.objects.filter(
+        _all(PageUsage).filter(
@@
-        api_subquery = APIDeployment.objects.filter(
+        api_subquery = _all(APIDeployment).filter(
             organization_id=organization_id
         ).values("id")
@@
-        WorkflowExecution.objects.filter(
+        _all(WorkflowExecution).filter(
@@
-        pipeline_subquery = Pipeline.objects.filter(
+        pipeline_subquery = _all(Pipeline).filter(
             organization_id=organization_id,
             pipeline_type="ETL",
         ).values("id")
@@
-        WorkflowExecution.objects.filter(
+        _all(WorkflowExecution).filter(
@@
-        WorkflowExecution.objects.filter(
+        _all(WorkflowExecution).filter(
```

backend/dashboard_metrics/management/commands/backfill_metrics.py-278-280 (1)
278-280: Be consistent with `_base_manager` vs `.objects` when bypassing org-context filters.

Hourly uses `_base_manager`, daily/monthly use `.objects`. If these models have a default manager that applies org scoping (as noted in PR context), daily/monthly backfill could be incomplete.

Proposed fix

```diff
-            obj, was_created = EventMetricsDaily.objects.update_or_create(
+            _, was_created = EventMetricsDaily._base_manager.update_or_create(
@@
-            obj, was_created = EventMetricsMonthly.objects.update_or_create(
+            _, was_created = EventMetricsMonthly._base_manager.update_or_create(
```

Also applies to: 309-311, 340-343
backend/dashboard_metrics/cache.py-163-179 (1)
163-179: Cache invalidation does not clear bucket caches for the org.

`invalidate_metrics_cache()` deletes `dashboard_metrics:{org_id}:*`, but bucket keys are `dashboard_metrics:bucket:<granularity>:<org_id>:...`, so they survive invalidation.

Proposed fix (clear both namespaces)

```diff
 def invalidate_metrics_cache(org_id: str) -> int:
@@
-    pattern = f"{CACHE_PREFIX}:{org_id}:*"
+    patterns = [
+        f"{CACHE_PREFIX}:{org_id}:*",
+        f"{CACHE_PREFIX}:bucket:*:{org_id}:*",
+    ]
     try:
-        CacheService.clear_cache(pattern)
+        for pattern in patterns:
+            CacheService.clear_cache(pattern)
         logger.info(f"Invalidated metrics cache for org {org_id}")
         return 1  # Pattern-based delete doesn't return count
```

Also applies to: 333-364
backend/dashboard_metrics/capture.py-79-82 (1)
79-82: Don't mutate the caller-provided `labels` dict.

`metric_labels = labels or {}` plus adding keys mutates the caller's dict when `labels` is passed in.

Proposed fix

```diff
-    metric_labels = labels or {}
+    metric_labels = {**(labels or {})}
```

Apply similarly in all methods that enrich labels.
Also applies to: 116-119, 155-158, 208-213, 260-263, 328-331
backend/dashboard_metrics/capture.py-33-35 (1)
33-35: Avoid import-time settings capture for `METRICS_ENABLED` (stale toggle in tests/runtime).

`METRICS_ENABLED` is fixed at import time, so changing `settings.DASHBOARD_METRICS_ENABLED` during tests or runtime won't take effect.

Proposed fix

```diff
-# Flag to control whether metrics are captured (can be disabled for testing)
-METRICS_ENABLED = getattr(settings, "DASHBOARD_METRICS_ENABLED", True)
+def _metrics_enabled() -> bool:
+    # Read at call time so tests/runtime overrides work.
+    return getattr(settings, "DASHBOARD_METRICS_ENABLED", True)
```

Then replace `if not METRICS_ENABLED:` with `if not _metrics_enabled():` in all methods.

backend/dashboard_metrics/management/commands/backfill_metrics.py-35-49 (1)
35-49: Use `ClassVar` + ensure `metric_name` values match the rest of the metrics system.

Two concerns:

- `METRIC_CONFIGS` is a mutable class attribute (RUF012).
- Using raw strings (`"documents_processed"`, etc.) risks drift vs canonical `MetricName` identifiers used by capture/queue processing.

Proposed fix

```diff
+from typing import ClassVar
@@
-    METRIC_CONFIGS = [
+    METRIC_CONFIGS: ClassVar[list[tuple[str, Any, bool]]] = [
         ("documents_processed", MetricsQueryService.get_documents_processed, False),
```

Separately, consider importing the canonical metric name enum/constants and using them here (if available).
backend/dashboard_metrics/management/commands/backfill_metrics.py-58-63 (1)
58-63: `--org-id` uses the wrong field: it should query `organization_id`, not `id`.

The help text says "organization ID" and the parameter accepts a string (line 60), but the code queries the integer `id` field (line 105). Throughout the codebase, the standard pattern is to use `Organization.objects.get(organization_id=org_id)` for string identifiers. If a caller passes the organization's string identifier (the `organization_id` field), this will silently fail with `Organization.DoesNotExist`.

Change line 105 to:

```python
orgs = [Organization.objects.get(organization_id=org_id)]
```

This aligns with the consistent pattern used in `internal_api_views.py`, `user_context.py`, and the rate limit management commands.

backend/dashboard_metrics/management/commands/backfill_metrics.py-194-257 (1)
194-257: `metric_count` currently counts periods (rows returned), not events or samples, which breaks histogram semantics and is meaningless for counters.

Given that `MetricsQueryService.*` returns pre-aggregated data (one row per period with the value already aggregated), `count += 1` makes `metric_count` represent the number of time periods rather than the number of events or samples. For example:

- `documents_processed`: returns 1 row per hour with `value=Count(id)` of 100 documents, but `metric_count=1` (the row count)
- `pages_processed` (histogram): returns 1 row per hour with `value=Sum(pages_processed)` of 500 pages from 10 source records, but `metric_count=1` instead of 10

To fix this, query methods need to return sample/event count information, or the backfill code should be restructured to count source records directly. For counters, `metric_count` could equal `value` (the event count is already in the value); for histograms, it should reflect the actual number of aggregated records.
🟡 Minor comments (13)
backend/dashboard_metrics/README.md-729-733 (1)
729-733: Migration file names may be out of sync.

The file structure section lists migrations as `0002_setup_cleanup_tasks.py` and `0003_setup_aggregation_task.py`, but the PR description indicates different migration names (`0002_setup_cleanup_periodic_task`, `0003_add_daily_monthly_tables`, `0004_update_cleanup_periodic_tasks`). Please update the documentation to match the actual migration filenames.

backend/dashboard_metrics/management/commands/generate_metrics_test_data.py-397-425 (1)
397-425: `PageUsage.created_at` backfill ignores the originally chosen timestamps.

You compute `created_at` per record, but then overwrite it with a new random timestamp during the update loop. Store `_target_created_at` like the other generators and update from that.

frontend/src/hooks/useMetricsData.js-77-84 (1)
77-84: Docs don’t match behavior for default date range.
JSDoc says overview defaults to "7 days ago → now", but the hook doesn't implement those defaults (it passes `null` through). Either implement defaults or fix the docs to avoid caller confusion.

Also applies to: 85-92
frontend/src/components/metrics-dashboard/MetricsChart.jsx-74-87 (1)
74-87: Handle "no metrics selected" when `allowClear` is used.

With `allowClear`, `selectedMetrics` can become `[]`, and the chart renders with no series (looks broken). Consider either preventing empty selection, auto-falling back to the first available metric, or rendering an `<Empty />` state when `metricsToShow.length === 0`.

Also applies to: 118-258
frontend/src/components/metrics-dashboard/MetricsTable.jsx-64-77 (1)
64-77: Handle `null` values in min/max render functions.

The check `value !== undefined` passes for `null`, but `null.toFixed(2)` throws a TypeError. The API may return `null` for metrics without min/max values.

🐛 Proposed fix

```diff
 {
   title: "Min",
   dataIndex: "min_value",
   key: "min_value",
-  render: (value) => (value !== undefined ? value.toFixed(2) : "-"),
+  render: (value) => (value != null ? value.toFixed(2) : "-"),
   align: "right",
 },
 {
   title: "Max",
   dataIndex: "max_value",
   key: "max_value",
-  render: (value) => (value !== undefined ? value.toFixed(2) : "-"),
+  render: (value) => (value != null ? value.toFixed(2) : "-"),
   align: "right",
 },
```

frontend/src/components/metrics-dashboard/MetricsSummary.jsx-13-118 (1)
13-118: Potential React key collision if totals aren't unique by `metric_name`.

If the API ever returns multiple rows for the same `metric_name` (e.g., per `project`/`tag`), cards will overwrite each other due to `key={metric.metric_name}`. Consider keying by a stable composite (or falling back to the index).

unstract/core/src/unstract/core/metrics/__init__.py-94-101 (1)
94-101: Use parameterized logging for the unknown-metric warning.
Avoid f-strings in logger calls.

Proposed fix

```diff
@@
     if not validate_metric(metric_name):
-        logger.warning(f"Unknown metric: {metric_name}. Skipping.")
+        logger.warning("Unknown metric: %s. Skipping.", metric_name)
         return False
```

backend/backend/celery_config.py-34-74 (1)
34-74: Add `ClassVar` annotations for mutable CeleryConfig attributes (Ruff RUF012).

This avoids accidental instance-level mutation patterns and satisfies the linter.

Proposed fix

```diff
@@
 from urllib.parse import quote_plus

 from django.conf import settings
 from kombu import Queue
+from typing import ClassVar, Any

 class CeleryConfig:
@@
-    broker_transport_options = {
+    broker_transport_options: ClassVar[dict[str, Any]] = {
         "max_retries": 3,
         "interval_start": 0,
         "interval_step": 0.2,
         "interval_max": 1,
         "socket_timeout": 30,
         "socket_keepalive": True,
     }
@@
-    task_queues = [
+    task_queues: ClassVar[list[Queue]] = [
         Queue("celery", routing_key="celery"),
         Queue("dashboard_metric_events", routing_key="dashboard_metric_events"),
         Queue("celery_periodic_logs", routing_key="celery_periodic_logs"),
         Queue("celery_log_task_queue", routing_key="celery_log_task_queue"),
         Queue("celery_api_deployments", routing_key="celery_api_deployments"),
         Queue("file_processing", routing_key="file_processing"),
         Queue("api_file_processing", routing_key="api_file_processing"),
         Queue("file_processing_callback", routing_key="file_processing_callback"),
         Queue("api_file_processing_callback", routing_key="api_file_processing_callback"),
     ]
@@
-    task_routes = {
+    task_routes: ClassVar[dict[str, dict[str, str]]] = {
         "dashboard_metrics.process_events": {"queue": "dashboard_metric_events"},
         "dashboard_metrics.aggregate_from_sources": {"queue": "dashboard_metric_events"},
         "dashboard_metrics.cleanup_hourly_data": {"queue": "dashboard_metric_events"},
         "dashboard_metrics.cleanup_daily_data": {"queue": "dashboard_metric_events"},
     }
```

backend/dashboard_metrics/tasks.py-553-567 (1)
553-567: `pages_processed` is marked histogram, but it's a counter-like metric (sum of pages).

Unless you intentionally treat it as a histogram in UI/rollups, set it to `False` here.

Proposed fix

```diff
-    ("pages_processed", MetricsQueryService.get_pages_processed, True),
+    ("pages_processed", MetricsQueryService.get_pages_processed, False),
```

backend/dashboard_metrics/cache.py-185-196 (1)
185-196: Type mismatch: `set_cached_overview` expects a dict but `warm_cache()` passes a list.

Either change the annotation to `list[dict[str, Any]]` or change what's stored.

Proposed fix (make it a list everywhere)

```diff
 class MetricsCacheService:
@@
-    def get_cached_overview(org_id: str) -> dict[str, Any] | None:
+    def get_cached_overview(org_id: str) -> list[dict[str, Any]] | None:
@@
-    def set_cached_overview(org_id: str, data: dict[str, Any]) -> None:
+    def set_cached_overview(org_id: str, data: list[dict[str, Any]]) -> None:
```

Also applies to: 231-256
backend/dashboard_metrics/capture.py-215-233 (1)
215-233: Return value should reflect both `LLM_CALLS` and `LLM_USAGE` writes (partial success hidden).

If `cost > 0` and the usage write fails, the method still returns `call_result`.

Proposed fix

```diff
-    if cost > 0:
-        record(
+    usage_result = True
+    if cost > 0:
+        usage_result = record(
             org_id=org_id,
             metric_name=MetricName.LLM_USAGE,
             metric_value=cost,
             labels=metric_labels,
             project=project,
         )
-
-    return call_result
+    return call_result and usage_result
```

backend/dashboard_metrics/capture.py-159-177 (1)
159-177: Validate `pages` to avoid negative/zero page metrics.

As-is, `pages` can be 0/negative and will write incorrect histogram/counter values.

Proposed fix

```diff
 def record_document_processed(
     org_id: str,
     pages: int = 1,
@@
     if record is None:
         return False
+    if pages < 0:
+        return False
```

backend/dashboard_metrics/cache.py-142-157 (1)
142-157: Cached `Response` from serialized data is fragile for future customization.

The current pattern caches `response.data` (a dict) and reconstructs it as `Response(cached)` on cache hits. This works fine for plain JSON endpoints, and the four decorated endpoints here (overview, live_summary, live_series, live-summary) currently set no custom headers or renderers, so there's no immediate impact.

However, this pattern is brittle: if custom headers or non-default renderers are added to a cached endpoint in the future, they would silently be lost on cache hits without warning. Consider documenting that this decorator is only suitable for endpoints that return unadorned JSON, or cache/return the full response object with headers intact.
🧹 Nitpick comments (16)
backend/dashboard_metrics/README.md (1)
15-23: Consider adding language hints to code blocks.

Several fenced code blocks in the documentation (ASCII diagrams, data flow, file structure) lack language specifiers. While these are primarily visual diagrams, adding `text` or `plaintext` as the language identifier would satisfy markdown linters and improve consistency.

frontend/src/helpers/metricsCache.js (1)
24-30: Consider handling complex parameter values.

The `buildCacheKey` function converts params to strings using template literals, which would produce `[object Object]` for nested objects or inconsistent results for arrays. If the metrics API ever uses complex query parameters, this could cause cache key collisions or misses.

♻️ Suggested improvement

```diff
 function buildCacheKey(endpoint, params) {
   const sortedKeys = Object.keys(params).sort();
-  const paramStr = sortedKeys.map((k) => `${k}=${params[k]}`).join("&");
+  const paramStr = sortedKeys
+    .map((k) => `${k}=${JSON.stringify(params[k])}`)
+    .join("&");
   // Use simple encoding to avoid btoa issues with unicode
   const hash = paramStr ? `_${encodeURIComponent(paramStr)}` : "";
   return `${CACHE_PREFIX}${endpoint}${hash}`;
 }
```

backend/dashboard_metrics/management/commands/populate_test_metrics.py (1)
60-67: Consider chaining exceptions for better debugging.

When re-raising exceptions, chaining them preserves the original traceback, which aids debugging.
♻️ Suggested improvement
```diff
 if org_id:
     try:
         org = Organization.objects.get(id=org_id)
     except Organization.DoesNotExist:
-        raise CommandError(f"Organization with ID {org_id} not found")
+        raise CommandError(f"Organization with ID {org_id} not found") from None
 else:
     org = Organization.objects.first()
     if not org:
         raise CommandError("No organizations found. Create one first.")
```

frontend/src/components/metrics-dashboard/MetricsChart.jsx (1)
266-292: Stabilize `MultiSeriesChart` metric order for consistent colors/legend.

`metricNames` comes from `Object.entries(data.series)` iteration order; sorting will keep line colors stable across runs and payload shapes.

Proposed diff

```diff
   const { chartData, metricNames } = useMemo(() => {
     if (!data?.series) return { chartData: [], metricNames: [] };
     const dateMap = new Map();
     const names = [];
     Object.entries(data.series).forEach(([metricName, values]) => {
       names.push(metricName);
       values.forEach(({ date, value }) => {
         if (!dateMap.has(date)) {
           dateMap.set(date, { date });
         }
         dateMap.get(date)[metricName] = value;
       });
     });
     const sorted = Array.from(dateMap.values()).sort(
       (a, b) => new Date(a.date) - new Date(b.date)
     );
-    return { chartData: sorted, metricNames: names };
+    return { chartData: sorted, metricNames: names.sort() };
   }, [data]);
```

Also applies to: 341-353
backend/dashboard_metrics/models.py (1)
63-105: Consider making `metric_count` non-negative.

`metric_count` should never be negative; `PositiveIntegerField` (or `PositiveBigIntegerField` if large) better encodes the invariant; a sketch follows.

Also applies to: 169-211, 275-317
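A minimal sketch of the suggested field type (field and model names from the comment above; the default value is an assumption):

```python
from django.db import models

class EventMetricsHourly(models.Model):
    # ... other fields unchanged ...
    metric_count = models.PositiveBigIntegerField(default=0)  # DB-level non-negativity
```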
unstract/core/src/unstract/core/metrics/__init__.py (1)
39-53: Ruff RUF022: `__all__` ordering.

Not functionally important, but easy to clean up if you want to keep lint noise down.

unstract/core/src/unstract/core/metrics/backends/queue.py (2)
69-115: Retry/backoff here can stall request threads for several seconds when RabbitMQ is down.
If `record()` is called in synchronous web paths, the exponential sleeps (line 111) can become user-facing latency. Consider making retry configurable per environment (or "fail fast" by default), and/or using Kombu's publish retry policies instead of reconnect sleeps.
90-97: Swallowing connection-close errors hides useful diagnostics.
At least log at debug level so broker flaps are diagnosable.

backend/dashboard_metrics/views.py (4)
165-232: Bucket cache DB fallback is N-queries-per-missing-hour (cold cache over 7 days can be ~168 queries).
This will be expensive under load. Prefer one ranged query for the missing span and then bucket in Python.
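A minimal sketch of the single-ranged-query approach, assuming the hourly model and field names used elsewhere in this review (`EventMetricsHourly`, `timestamp`); a sketch, not the view's actual code:

```python
from collections import defaultdict

def fetch_missing_buckets(org_id, missing_hours):
    """One ranged query across the whole missing span, bucketed in Python."""
    if not missing_hours:
        return {}
    rows = EventMetricsHourly.objects.filter(
        organization_id=org_id,
        timestamp__gte=min(missing_hours),
        timestamp__lte=max(missing_hours),
    ).values("timestamp", "metric_name", "metric_value", "metric_count")
    buckets = defaultdict(list)
    for row in rows:
        buckets[row["timestamp"]].append(row)  # group rows by hour bucket
    return buckets
```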
391-401: Remove the inner `from datetime import datetime` (it's redundant and easy to shadow).

You already import `datetime` at the top (line 5); the inner import (line 391) is unnecessary.
486-523: Monthly path uses `TruncDay(month)` and then returns "daily"-shaped points; clarify granularity semantics.

If the UI requests `granularity=week/day` but the source is monthly, you'll still return one point per month-start day. If that's intended, fine; otherwise consider forcing `granularity` to `month` when `source=monthly` (and exposing `month` in the serializer choices).
868-878: Cache health check key is global; use a namespaced/random key to avoid cross-worker interference.
Multiple instances probing the same key can produce false negatives.

backend/dashboard_metrics/services.py (1)
101-134: Sums can return NULL; consider `Coalesce(..., 0)` to avoid downstream `None` checks (see the sketch below).

You already guard in `get_all_metrics_summary` for some metrics; pushing it into the query keeps call sites simpler.

Also applies to: 330-364
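A minimal sketch of the `Coalesce` idea, with illustrative queryset and field names (not the actual service code):

```python
from django.db.models import Sum, Value
from django.db.models.functions import Coalesce

# Returns 0 instead of None when no rows match, so callers skip None checks.
total = queryset.aggregate(
    total=Coalesce(Sum("metric_value"), Value(0.0))
)["total"]
```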
backend/dashboard_metrics/tasks.py (1)
577-580: Iterate orgs with `.iterator()` to avoid loading all org IDs into memory; especially if this runs every 15 minutes. A minimal sketch follows.
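A sketch assuming the task loops over organization identifiers (model, field, and task-call names are assumptions):

```python
# Streams IDs from the DB cursor in chunks instead of materializing the list.
for org_id in Organization.objects.values_list(
    "organization_id", flat=True
).iterator(chunk_size=500):
    aggregate_metrics_for_org(org_id)  # hypothetical per-org call
```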
backend/dashboard_metrics/serializers.py (1)
11-72: Validation is solid; consider making naive datetimes explicitly timezone-aware.

Depending on DRF settings, `DateTimeField` may accept naive values; the views mix both patterns (overview makes aware). Centralizing this in the serializer avoids surprises.

backend/dashboard_metrics/cache.py (1)
101-105: Prefer SHA-256 over MD5 for cache key hashing (FIPS/security posture).
MD5 isn't a security risk here, but it is often disallowed in hardened environments.

Proposed fix

```diff
-    params_hash = hashlib.md5(sorted_params.encode()).hexdigest()[:12]
+    params_hash = hashlib.sha256(sorted_params.encode()).hexdigest()[:12]
```
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to Reviews > Disable Cache setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
⛔ Files ignored due to path filters (2)
- `backend/uv.lock` is excluded by `!**/*.lock`
- `frontend/src/assets/dashboard.svg` is excluded by `!**/*.svg`
📒 Files selected for processing (36)
- backend/backend/celery_config.py
- backend/backend/settings/base.py
- backend/dashboard_metrics/README.md
- backend/dashboard_metrics/cache.py
- backend/dashboard_metrics/capture.py
- backend/dashboard_metrics/management/__init__.py
- backend/dashboard_metrics/management/commands/__init__.py
- backend/dashboard_metrics/management/commands/backfill_metrics.py
- backend/dashboard_metrics/management/commands/generate_metrics_test_data.py
- backend/dashboard_metrics/management/commands/populate_test_metrics.py
- backend/dashboard_metrics/migrations/0001_initial.py
- backend/dashboard_metrics/migrations/0002_setup_cleanup_tasks.py
- backend/dashboard_metrics/migrations/0003_setup_aggregation_task.py
- backend/dashboard_metrics/models.py
- backend/dashboard_metrics/serializers.py
- backend/dashboard_metrics/services.py
- backend/dashboard_metrics/tasks.py
- backend/dashboard_metrics/tests/test_tasks.py
- backend/dashboard_metrics/urls.py
- backend/dashboard_metrics/views.py
- backend/pyproject.toml
- frontend/package.json
- frontend/src/components/metrics-dashboard/MetricsChart.jsx
- frontend/src/components/metrics-dashboard/MetricsDashboard.css
- frontend/src/components/metrics-dashboard/MetricsDashboard.jsx
- frontend/src/components/metrics-dashboard/MetricsSummary.jsx
- frontend/src/components/metrics-dashboard/MetricsTable.jsx
- frontend/src/components/metrics-dashboard/index.js
- frontend/src/components/navigations/side-nav-bar/SideNavBar.jsx
- frontend/src/helpers/metricsCache.js
- frontend/src/hooks/useMetricsData.js
- frontend/src/pages/MetricsDashboardPage.jsx
- frontend/src/routes/useMainAppRoutes.js
- unstract/core/src/unstract/core/metrics/__init__.py
- unstract/core/src/unstract/core/metrics/backends/noop.py
- unstract/core/src/unstract/core/metrics/backends/queue.py
✅ Files skipped from review due to trivial changes (1)
- frontend/src/components/metrics-dashboard/MetricsDashboard.css
🚧 Files skipped from review as they are similar to previous changes (4)
- backend/pyproject.toml
- backend/dashboard_metrics/tests/test_tasks.py
- backend/dashboard_metrics/urls.py
- backend/backend/settings/base.py
🧰 Additional context used
🧬 Code graph analysis (8)
frontend/src/components/metrics-dashboard/MetricsChart.jsx (1)
- frontend/src/components/metrics-dashboard/MetricsDashboard.jsx (1): `Typography` (34-34)

frontend/src/components/metrics-dashboard/MetricsDashboard.jsx (1)
- frontend/src/components/metrics-dashboard/MetricsChart.jsx (2): `useMemo` (130-164), `useMemo` (267-291)

backend/dashboard_metrics/services.py (5)
- backend/api_v2/models.py (1): `APIDeployment` (43-154)
- backend/pipeline_v2/models.py (1): `Pipeline` (35-156)
- backend/usage_v2/models.py (1): `Usage` (26-81)
- backend/workflow_manager/file_execution/models.py (1): `WorkflowFileExecution` (68-250)
- backend/workflow_manager/workflow_v2/models/execution.py (2): `WorkflowExecution` (102-372), `organization_id` (266-274)

backend/dashboard_metrics/serializers.py (2)
- backend/dashboard_metrics/models.py (4): `EventMetricsHourly` (39-142), `Meta` (109-142), `Meta` (215-248), `Meta` (321-354)
- backend/dashboard_metrics/views.py (2): `series` (355-566), `summary` (243-352)

backend/dashboard_metrics/management/commands/backfill_metrics.py (3)
- backend/account_v2/models.py (1): `Organization` (12-46)
- backend/dashboard_metrics/models.py (4): `EventMetricsDaily` (145-248), `EventMetricsHourly` (39-142), `EventMetricsMonthly` (251-354), `MetricType` (14-18)
- backend/dashboard_metrics/services.py (10): `MetricsQueryService` (37-464), `get_documents_processed` (67-99), `get_pages_processed` (102-133), `get_llm_calls` (136-169), `get_challenges` (172-206), `get_summarization_calls` (209-243), `get_deployed_api_requests` (246-285), `get_etl_pipeline_executions` (288-328), `get_llm_usage_cost` (331-364), `get_prompt_executions` (367-398)

unstract/core/src/unstract/core/metrics/backends/noop.py (3)
- unstract/core/src/unstract/core/metrics/backends/base.py (3): `AbstractMetricBackend` (8-40), `record` (15-24), `flush` (27-32)
- unstract/core/src/unstract/core/metrics/types.py (1): `MetricEvent` (52-96)
- unstract/core/src/unstract/core/metrics/__init__.py (1): `record` (58-119)

backend/dashboard_metrics/migrations/0001_initial.py (2)
- backend/dashboard_metrics/migrations/0003_setup_aggregation_task.py (1): `Migration` (40-51)
- backend/dashboard_metrics/migrations/0002_setup_cleanup_tasks.py (1): `Migration` (70-81)

unstract/core/src/unstract/core/metrics/__init__.py (5)
- unstract/core/src/unstract/core/metrics/config.py (2): `get_backend` (28-60), `reset_backend` (63-68)
- unstract/core/src/unstract/core/metrics/registry.py (3): `get_all_metrics` (42-48), `get_metric_type` (25-39), `validate_metric` (13-22)
- unstract/core/src/unstract/core/metrics/types.py (3): `MetricEvent` (52-96), `MetricName` (15-32), `MetricType` (8-12)
- unstract/core/src/unstract/core/metrics/backends/noop.py (1): `record` (25-40)
- unstract/core/src/unstract/core/metrics/backends/queue.py (1): `record` (117-156)
🪛 markdownlint-cli2 (0.18.1)
backend/dashboard_metrics/README.md
15-15: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
26-26: Tables should be surrounded by blank lines
(MD058, blanks-around-tables)
47-47: Tables should be surrounded by blank lines
(MD058, blanks-around-tables)
54-54: Tables should be surrounded by blank lines
(MD058, blanks-around-tables)
86-86: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
187-187: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
299-299: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
482-482: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
512-512: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
712-712: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🪛 Ruff (0.14.11)
backend/dashboard_metrics/migrations/0003_setup_aggregation_task.py
6-6: Unused function argument: schema_editor
(ARG001)
33-33: Unused function argument: schema_editor
(ARG001)
41-44: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
46-51: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
backend/backend/celery_config.py
42-49: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
56-66: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
69-74: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
unstract/core/src/unstract/core/metrics/backends/queue.py
95-96: try-except-pass detected, consider logging the exception
(S110)
95-95: Do not catch blind exception: Exception
(BLE001)
101-101: Consider moving this statement to an else block
(TRY300)
103-103: Do not catch blind exception: Exception
(BLE001)
113-113: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
115-115: Avoid specifying long messages outside the exception class
(TRY003)
149-149: Consider moving this statement to an else block
(TRY300)
212-212: Do not catch blind exception: Exception
(BLE001)
backend/dashboard_metrics/models.py
113-130: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
131-142: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
219-236: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
237-248: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
325-342: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
343-354: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
backend/dashboard_metrics/management/commands/populate_test_metrics.py
23-33: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
53-53: Unused method argument: args
(ARG002)
63-63: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
63-63: Avoid specifying long messages outside the exception class
(TRY003)
67-67: Avoid specifying long messages outside the exception class
(TRY003)
91-91: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
117-117: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
167-167: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
172-172: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
174-174: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
176-176: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
177-177: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
184-186: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
187-187: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
190-190: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
193-193: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
backend/dashboard_metrics/tasks.py
186-186: Do not catch blind exception: Exception
(BLE001)
398-398: Unpacked variable obj is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
439-439: Unpacked variable obj is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
480-480: Unpacked variable obj is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
669-669: Do not catch blind exception: Exception
(BLE001)
692-692: Redundant exception object included in logging.exception call
(TRY401)
backend/dashboard_metrics/management/commands/generate_metrics_test_data.py
29-32: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
34-41: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
43-45: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
48-54: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
56-62: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
89-89: Unused method argument: args
(ARG002)
232-232: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
244-244: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
246-246: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
250-252: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
253-255: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
258-258: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
259-259: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
285-285: Unused method argument: start_time
(ARG002)
285-285: Unused method argument: end_time
(ARG002)
292-292: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
299-299: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
302-302: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
307-307: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
308-308: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
309-309: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
313-313: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
351-351: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
361-361: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
364-364: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
365-365: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
369-369: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
379-379: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
408-408: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
409-409: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
411-411: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
412-412: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
421-421: Loop control variable i not used within loop body
Rename unused i to _i
(B007)
backend/dashboard_metrics/migrations/0002_setup_cleanup_tasks.py
6-6: Unused function argument: schema_editor
(ARG001)
58-58: Unused function argument: schema_editor
(ARG001)
71-74: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
76-81: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
backend/dashboard_metrics/serializers.py
121-132: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
backend/dashboard_metrics/capture.py
45-45: Consider moving this statement to an else block
(TRY300)
backend/dashboard_metrics/management/commands/backfill_metrics.py
35-49: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
85-85: Unused method argument: args
(ARG002)
258-258: Do not catch blind exception: Exception
(BLE001)
278-278: Unpacked variable obj is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
295-295: Do not catch blind exception: Exception
(BLE001)
309-309: Unpacked variable obj is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
326-326: Do not catch blind exception: Exception
(BLE001)
340-340: Unpacked variable obj is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
357-357: Do not catch blind exception: Exception
(BLE001)
unstract/core/src/unstract/core/metrics/backends/noop.py
25-25: Unused method argument: event
(ARG002)
backend/dashboard_metrics/migrations/0001_initial.py
13-15: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
17-381: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
backend/dashboard_metrics/cache.py
103-103: Probable use of insecure hash functions in hashlib: md5
(S324)
176-176: Consider moving this statement to an else block
(TRY300)
277-277: Do not catch blind exception: Exception
(BLE001)
470-470: Consider moving this statement to an else block
(TRY300)
510-510: Do not catch blind exception: Exception
(BLE001)
backend/dashboard_metrics/views.py
59-59: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
60-60: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
62-62: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
63-63: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
64-64: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
72-72: Abstract raise to an inner function
(TRY301)
72-72: Avoid specifying long messages outside the exception class
(TRY003)
78-78: Avoid specifying long messages outside the exception class
(TRY003)
84-84: Avoid specifying long messages outside the exception class
(TRY003)
235-235: Loop control variable bucket_ts not used within loop body
(B007)
237-237: Loop control variable bucket_ts not used within loop body
(B007)
679-679: Avoid specifying long messages outside the exception class
(TRY003)
737-737: Avoid specifying long messages outside the exception class
(TRY003)
830-830: Unused method argument: request
(ARG002)
863-863: Consider moving this statement to an else block
(TRY300)
875-875: Consider moving this statement to an else block
(TRY300)
unstract/core/src/unstract/core/metrics/__init__.py
39-53: __all__ is not sorted
Apply an isort-style sorting to __all__
(RUF022)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
```python
def _upsert_hourly(self, aggregations: dict) -> tuple[int, int]:
    """Upsert hourly aggregations."""
    created, updated = 0, 0
    with transaction.atomic():
        for key, agg in aggregations.items():
            org_id, ts_str, metric_name, project, tag = key
            timestamp = datetime.fromisoformat(ts_str)

            try:
                obj, was_created = EventMetricsHourly._base_manager.update_or_create(
                    organization_id=org_id,
                    timestamp=timestamp,
                    metric_name=metric_name,
                    project=project,
                    tag=tag,
                    defaults={
                        "metric_type": agg["metric_type"],
                        "metric_value": agg["value"],
                        "metric_count": agg["count"],
                        "labels": agg["labels"],
                    },
                )
                if was_created:
                    created += 1
                else:
                    updated += 1
            except Exception as e:
                logger.warning(f"Error upserting hourly {key}: {e}")

    return created, updated
```
Catching exceptions inside a single outer transaction.atomic() can break the whole backfill.
If any DB error occurs (e.g., IntegrityError), the transaction may become unusable and subsequent iterations can fail even though exceptions are caught.
Simplest fix: remove the outer transaction.atomic() (each update_or_create() is already atomic), or use a per-row savepoint.
Proposed fix (remove outer atomic + keep loop going safely)

```diff
 def _upsert_hourly(self, aggregations: dict) -> tuple[int, int]:
     """Upsert hourly aggregations."""
     created, updated = 0, 0
-    with transaction.atomic():
-        for key, agg in aggregations.items():
-            org_id, ts_str, metric_name, project, tag = key
-            timestamp = datetime.fromisoformat(ts_str)
+    for key, agg in aggregations.items():
+        org_id, ts_str, metric_name, project, tag = key
+        timestamp = datetime.fromisoformat(ts_str)
-            try:
-                obj, was_created = EventMetricsHourly._base_manager.update_or_create(
+        try:
+            _, was_created = EventMetricsHourly._base_manager.update_or_create(
                 organization_id=org_id,
                 timestamp=timestamp,
                 metric_name=metric_name,
                 project=project,
                 tag=tag,
                 defaults={
                     "metric_type": agg["metric_type"],
                     "metric_value": agg["value"],
                     "metric_count": agg["count"],
                     "labels": agg["labels"],
                 },
             )
-            if was_created:
-                created += 1
-            else:
-                updated += 1
-            except Exception as e:
-                logger.warning(f"Error upserting hourly {key}: {e}")
+            if was_created:
+                created += 1
+            else:
+                updated += 1
+        except Exception as e:
+            logger.warning(f"Error upserting hourly {key}: {e}")
     return created, updated
```

Apply the same pattern to daily/monthly; also prefix unused `obj` as `_` (RUF059).
Also applies to: 300-329, 331-360
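For the per-row savepoint alternative mentioned above, a minimal sketch (names taken from the command's code; a sketch, not the author's implementation):

```python
from datetime import datetime

from django.db import transaction

for key, agg in aggregations.items():
    org_id, ts_str, metric_name, project, tag = key
    try:
        # Nested atomic() creates a savepoint per row: a failed upsert rolls
        # back alone and leaves the outer transaction usable for the rest.
        with transaction.atomic():
            EventMetricsHourly._base_manager.update_or_create(
                organization_id=org_id,
                timestamp=datetime.fromisoformat(ts_str),
                metric_name=metric_name,
                project=project,
                tag=tag,
                defaults={
                    "metric_type": agg["metric_type"],
                    "metric_value": agg["value"],
                    "metric_count": agg["count"],
                    "labels": agg["labels"],
                },
            )
    except Exception as e:
        logger.warning(f"Error upserting hourly {key}: {e}")
```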
🧰 Tools
🪛 Ruff (0.14.11)
278-278: Unpacked variable obj is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
295-295: Do not catch blind exception: Exception
(BLE001)
🤖 Prompt for AI Agents
In @backend/dashboard_metrics/management/commands/backfill_metrics.py around
lines 269 - 298, The outer transaction.atomic() in _upsert_hourly wraps all
update_or_create calls so a DB error (e.g., IntegrityError) can leave the
transaction unusable and abort the rest of the loop; remove the outer
transaction.atomic() (or alternatively use a per-row savepoint) so each
EventMetricsHourly._base_manager.update_or_create call runs in its own atomic
context and the loop can continue on exceptions; also change the unused returned
variable name obj to _ and apply the same change (remove outer atomic or use
per-row savepoints and rename obj → _) in the corresponding daily/monthly upsert
methods that use update_or_create.
```python
def _clean_test_data(self, org_id_str: str, workflow: Workflow):
    """Remove existing test data."""
    self.stdout.write("Cleaning existing test data...")

    # Delete file executions first (FK constraint)
    file_exec_count = WorkflowFileExecution.objects.filter(
        workflow_execution__workflow=workflow
    ).count()
    WorkflowFileExecution.objects.filter(
        workflow_execution__workflow=workflow
    ).delete()

    # Delete workflow executions
    exec_count = WorkflowExecution.objects.filter(workflow=workflow).count()
    WorkflowExecution.objects.filter(workflow=workflow).delete()

    # Delete usage records
    usage_count = Usage.objects.filter(organization_id=org_id_str).count()
    Usage.objects.filter(organization_id=org_id_str).delete()

    # Delete page usage records
    page_count = PageUsage.objects.filter(organization_id=org_id_str).count()
    PageUsage.objects.filter(organization_id=org_id_str).delete()

    self.stdout.write(
        f"  Deleted: {exec_count} executions, {file_exec_count} file executions, "
        f"{usage_count} usage, {page_count} page usage"
    )
```
--clean is dangerously destructive (deletes real org data).
_clean_test_data() deletes all Usage + PageUsage for the org, not just data created by this command. This is too risky for staging/prod. Require an explicit --force/--i-understand-this-deletes-real-data flag and/or scope deletions to records created by this command (e.g., dedicated workflow name/tag/marker label).
```python
def _generate_workflow_executions(
    self, workflow: Workflow, start_time, end_time, count: int
) -> list[WorkflowExecution]:
    """Generate WorkflowExecution records."""
    executions = []
    for _ in range(count):
        created_at = self._random_timestamp(start_time, end_time)

        # 90% completed, 10% error (matching staging ratio)
        is_error = random.random() >= 0.9
        status = ExecutionStatus.ERROR if is_error else ExecutionStatus.COMPLETED
        error_message = random.choice(self.EXECUTION_ERRORS) if is_error else ""

        exec_obj = WorkflowExecution(
            workflow=workflow,
            execution_mode=random.choice(
                [WorkflowExecution.Mode.INSTANT, WorkflowExecution.Mode.QUEUE]
            ),
            execution_method=random.choice(
                [WorkflowExecution.Method.DIRECT, WorkflowExecution.Method.SCHEDULED]
            ),
            execution_type=WorkflowExecution.Type.COMPLETE,
            status=status,
            total_files=random.randint(1, 10),
            execution_time=random.uniform(5.0, 120.0),
            error_message=error_message,
        )
        exec_obj._target_created_at = created_at  # Store target timestamp
        executions.append(exec_obj)

    # Bulk create
    WorkflowExecution.objects.bulk_create(executions)

    # Refresh to get IDs (needed since bulk_create may not return them)
    created_executions = list(
        WorkflowExecution.objects.filter(workflow=workflow).order_by("-id")[:count]
    )

    # Update created_at timestamps (auto_now_add overrides during bulk_create)
    for i, exec_obj in enumerate(created_executions):
        target_ts = executions[i]._target_created_at if i < len(executions) else None
        if target_ts:
            WorkflowExecution.objects.filter(id=exec_obj.id).update(
                created_at=target_ts
            )
            exec_obj.created_at = target_ts  # Update in-memory object too

    return created_executions
```
🧩 Analysis chain
🏁 Script executed:
find . -type f -name "*.py" | grep -E "(models|workflow)" | head -20
Repository: Zipstack/unstract
Length of output: 1019
🏁 Script executed:
find . -type f -name "*.py" -path "*/models*" | head -20
Repository: Zipstack/unstract
Length of output: 927
🏁 Script executed:
cat -n backend/dashboard_metrics/management/commands/generate_metrics_test_data.py | sed -n '235,283p'
Repository: Zipstack/unstract
Length of output: 2562
🏁 Script executed:
rg "class WorkflowExecution" -A 20 --type py
Repository: Zipstack/unstract
Length of output: 50373
🏁 Script executed:
cat -n backend/workflow_manager/workflow_v2/models/execution.py | head -100
Repository: Zipstack/unstract
Length of output: 4461
🏁 Script executed:
rg "created_at" backend/workflow_manager/workflow_v2/models/execution.py -B 2 -A 2
Repository: Zipstack/unstract
Length of output: 654
🏁 Script executed:
sed -n '1,150p' backend/workflow_manager/workflow_v2/models/execution.py | tail -80
Repository: Zipstack/unstract
Length of output: 2988
🏁 Script executed:
sed -n '100,250p' backend/workflow_manager/workflow_v2/models/execution.py
Repository: Zipstack/unstract
Length of output: 5347
🏁 Script executed:
find . -type f -name "base_model.py" | head -5
Repository: Zipstack/unstract
Length of output: 97
🏁 Script executed:
rg "class BaseModel" -A 30 --type py | head -80
Repository: Zipstack/unstract
Length of output: 453
🏁 Script executed:
rg "bulk_create\(" backend/ --type py -A 5 | head -60
Repository: Zipstack/unstract
Length of output: 4882
🏁 Script executed:
rg "bulk_update\(" backend/ --type py -A 3 | head -60
Repository: Zipstack/unstract
Length of output: 797
🏁 Script executed:
cat backend/requirements.txt | grep -i django
Repository: Zipstack/unstract
Length of output: 117
Length of output: 117
🏁 Script executed:
python3 << 'EOF'
# Check Django bulk_create behavior documentation
# In Django 1.10+, bulk_create returns a list of objects with PKs populated
# However, this depends on database backend support
import subprocess

result = subprocess.run(
    ["grep", "-r", "bulk_create", "backend/", "--include=*.py"],
    capture_output=True,
    text=True,
)
# Look for any comments or patterns about bulk_create and PKs
lines = result.stdout.split('\n')
for i, line in enumerate(lines):
    if 'bulk_create' in line and ('id' in line or 'pk' in line or 'return' in line):
        print(f"{i}: {line}")
EOF
Repository: Zipstack/unstract
Length of output: 147
🏁 Script executed:
sed -n '290,350p' backend/dashboard_metrics/management/commands/generate_metrics_test_data.py
Repository: Zipstack/unstract
Length of output: 2465
🏁 Script executed:
sed -n '150,210p' backend/dashboard_metrics/management/commands/generate_metrics_test_data.py
Repository: Zipstack/unstract
Length of output: 2359
🏁 Script executed:
sed -n '290,330p' backend/dashboard_metrics/management/commands/generate_metrics_test_data.py
Repository: Zipstack/unstract
Length of output: 1830
🏁 Script executed:
rg "class WorkflowFileExecution" -A 40 --type py | head -60
Repository: Zipstack/unstract
Length of output: 5126
🏁 Script executed:
rg "class WorkflowFileExecution" backend/ -A 50 --type py | head -80
Repository: Zipstack/unstract
Length of output: 6921
🏁 Script executed:
find . -type f -name "*.py" -path "*/models*" -exec grep -l "WorkflowFileExecution" {} \;
Repository: Zipstack/unstract
Length of output: 175
🏁 Script executed:
cat -n backend/workflow_manager/file_execution/models.py | head -100
Repository: Zipstack/unstract
Length of output: 4286
🏁 Script executed:
rg "created_at" backend/workflow_manager/file_execution/models.py -B 2 -A 2
Repository: Zipstack/unstract
Length of output: 436
Length of output: 436
🏁 Script executed:
python3 << 'EOF'
# Write a test to understand the behavior
test_code = """
import os
import django
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'unstract.settings')
django.setup()
from workflow_manager.workflow_v2.models import WorkflowExecution, Workflow
# Check Django version
print(f"Django version: {django.__version__}")
# Test if bulk_create returns objects with PKs populated
# We'll look at the method signature
import inspect
bulk_create_sig = inspect.signature(WorkflowExecution.objects.bulk_create)
print(f"bulk_create signature: {bulk_create_sig}")
# Check if bulk_update exists and its signature
if hasattr(WorkflowExecution.objects, 'bulk_update'):
bulk_update_sig = inspect.signature(WorkflowExecution.objects.bulk_update)
print(f"bulk_update signature: {bulk_update_sig}")
"""
# Write to file
with open('/tmp/test_django_bulk.py', 'w') as f:
f.write(test_code)
print("Test file created at /tmp/test_django_bulk.py")
EOFRepository: Zipstack/unstract
Length of output: 106
🏁 Script executed:
find . -type f -name "setup.py" -o -name "pyproject.toml" -o -name "setup.cfg" | head -5Repository: Zipstack/unstract
Length of output: 192
🏁 Script executed:
find . -name "setup.py" -o -name "pyproject.toml" | xargs cat 2>/dev/null | grep -i "django" | head -10Repository: Zipstack/unstract
Length of output: 470
🏁 Script executed:
find . -path "*/requirements*" -type f | xargs cat 2>/dev/null | grep -i "django" | head -5Repository: Zipstack/unstract
Length of output: 43
🏁 Script executed:
python3 << 'EOF'
# Django 4.2.1 behavior verification
# Check if bulk_update can update auto_now_add fields
import_note = """
Django 4.2.1 facts:
1. bulk_create() DOES return objects with populated PKs (since Django 1.10+)
2. bulk_update() exists (since Django 3.2+)
3. auto_now_add fields: Django skips them in normal .save() but NOT in bulk_update()
- However, auto_now_add fields are typically READ-ONLY at the ORM level
- bulk_update() will raise an error if you try to update auto_now_add fields
"""
print(import_note)
# The actual problem with the code:
problem = """
ISSUE WITH CURRENT CODE:
The line: WorkflowExecution.objects.filter(workflow=workflow).order_by("-id")[:count]
Is problematic because:
1. If multiple workflows exist, this might fetch executions from OTHER workflows
2. The .order_by("-id")[:count] doesn't guarantee getting the SAME rows created in bulk_create
3. Example: If you have 10 total rows, and bulk_create 5 new ones:
- order_by("-id")[:5] might get the 5 newest, but order might not match creation order
- Mapping executions[i]._target_created_at to created_executions[i] becomes wrong
ISSUE WITH PROPOSED SOLUTION:
The proposed fix assumes bulk_update() can override auto_now_add fields, which it CANNOT.
auto_now_add fields are protected and cannot be updated via Django ORM bulk_update().
"""
print(problem)
EOFRepository: Zipstack/unstract
Length of output: 1205
🏁 Script executed:
python3 << 'EOF'
# Check if bulk_update tries to update auto_now_add fields
# In the proposed solution, it tries to update created_at which has auto_now_add=True
# Looking at Django 4.2.1 documentation and source:
# Fields with auto_now_add=True cannot be updated via bulk_update()
# Django will either:
# 1. Skip the field silently (pre-3.2)
# 2. Raise ValidationError (4.2+)
print("""
CRITICAL FINDING:
In Django 4.2.1, the Model.Meta.readonly_fields and auto_now_add fields
are generally protected from bulk updates.
However, the current code uses a different approach:
- It uses individual .update() queries:
WorkflowExecution.objects.filter(id=exec_obj.id).update(created_at=target_ts)
This BYPASSES the ORM protection because .update() at the QuerySet level
directly executes SQL UPDATE statements, which DO allow setting auto_now_add fields.
The PROPOSED solution tries to use bulk_update(), which would be BLOCKED
by Django's ORM protection on auto_now_add fields.
""")
EOFRepository: Zipstack/unstract
Length of output: 657
🏁 Script executed:
cat > /tmp/check_solution.py << 'EOF'
# Simulate what the proposed solution does
proposed_code = """
# Proposed fix from review:
for exec_obj in executions:
exec_obj.created_at = exec_obj._target_created_at
WorkflowExecution.objects.bulk_update(executions, ["created_at"])
"""
# The issue:
# bulk_update() in Django 4.2.1 will NOT allow updating fields with auto_now_add=True
# It will raise a ValidationError or skip the field
print("PROPOSED SOLUTION PROBLEM:")
print("=" * 60)
print(proposed_code)
print("\nWill FAIL because:")
print("1. created_at has auto_now_add=True in BaseModel")
print("2. bulk_update() respects ORM constraints and won't update auto_now_add fields")
print("3. Django 4.2.1 strictly enforces readonly semantics")
print("\nThe CURRENT code actually works because:")
print("1. It uses QuerySet.update() which bypasses ORM constraints")
print("2. Direct SQL UPDATE statements can modify auto_now_add columns")
print("\nBUT the current code's REAL PROBLEM is:")
print("1. order_by('-id')[:count] is unreliable")
print("2. ID ordering doesn't guarantee getting the newly created rows")
print("3. Order of re-fetched rows may not match order in original list")
EOF
python3 /tmp/check_solution.pyRepository: Zipstack/unstract
Length of output: 904
Bulk-create timestamp backfill is unreliable, but the proposed solution won't work.
Re-querying WorkflowExecution.objects.filter(...).order_by("-id")[:count] is not guaranteed to fetch the newly created rows or preserve order. This can cause mismatches between _target_created_at values and the fetched executions.
However, the proposed bulk_update() solution will not work because created_at has auto_now_add=True, and Django 4.2.1 prevents bulk_update() from modifying auto_now_add fields. The current approach using individual QuerySet.update() calls bypasses this ORM protection, but the underlying re-query logic is the actual problem.
Use the PKs returned directly from bulk_create() and update them individually:
Corrected approach
# Bulk create returns objects with PKs already populated
created_executions = WorkflowExecution.objects.bulk_create(executions)

# Update created_at using the PKs we already have (no re-query needed)
for exec_obj in created_executions:
    WorkflowExecution.objects.filter(id=exec_obj.id).update(
        created_at=exec_obj._target_created_at
    )
🧰 Tools
🪛 Ruff (0.14.11)
244-244: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
246-246: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
250-252: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
253-255: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
258-258: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
259-259: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
🤖 Prompt for AI Agents
In @backend/dashboard_metrics/management/commands/generate_metrics_test_data.py
around lines 235 - 283, The re-query after bulk_create in
_generate_workflow_executions is unreliable and may misalign _target_created_at
values; instead use the list returned by
WorkflowExecution.objects.bulk_create(executions) (these objects have PKs
populated) and iterate that returned list to update each record's created_at via
individual QuerySet.update(id=exec_obj.id) using exec_obj._target_created_at (do
not attempt bulk_update because created_at uses auto_now_add).
with transaction.atomic():
    for key, agg in aggregations.items():
        org_id, hour_ts_str, metric_name, project, tag = key
        hour_ts = datetime.fromisoformat(hour_ts_str)

        try:
            obj, was_created = (
                EventMetricsHourly.objects.select_for_update().get_or_create(
                    organization_id=org_id,
                    timestamp=hour_ts,
                    metric_name=metric_name,
                    project=project,
                    tag=tag,
                    defaults={
                        "metric_type": agg["metric_type"],
                        "metric_value": agg["value"],
                        "metric_count": agg["count"],
                        "labels": agg["labels"],
                    },
                )
            )
            if was_created:
                created += 1
            else:
                obj.metric_value = F("metric_value") + agg["value"]
                obj.metric_count = F("metric_count") + agg["count"]
                if agg["labels"]:
                    obj.labels = {**obj.labels, **agg["labels"]}
                obj.save(
                    update_fields=[
                        "metric_value",
                        "metric_count",
                        "labels",
                        "modified_at",
                    ]
                )
                updated += 1
        except Exception:
            logger.exception("Error upserting hourly metric %s", key)

return created, updated
Catch IntegrityError explicitly to prevent silent data loss on concurrent upserts.
select_for_update() only locks existing rows. When two concurrent tasks insert the same (org, timestamp, metric, project, tag) key, the non-existent row cannot be locked, both attempt insert, one raises IntegrityError, and the generic Exception handler swallows it—losing the increment without returning an error signal.
Import IntegrityError and catch it explicitly with a retry (matching the pattern used elsewhere in the codebase in account_v2, connector_v2, pipeline_v2):
from django.db import IntegrityError

try:
    obj, was_created = EventMetricsHourly.objects.select_for_update().get_or_create(...)
except IntegrityError:
    # Row was created concurrently; fetch and update
    obj = EventMetricsHourly.objects.get(organization_id=org_id, timestamp=hour_ts, ...)
    was_created = False
except Exception:
    logger.exception("Error upserting hourly metric %s", key)
    continue

Alternatively, for PostgreSQL, use INSERT ... ON CONFLICT DO UPDATE to atomically upsert without retry logic.
Also applies to: _bulk_upsert_daily (276–317), _bulk_upsert_monthly (330–371).
🤖 Prompt for AI Agents
In @backend/dashboard_metrics/tasks.py around lines 222 - 262, The current broad
except in the upsert loop swallows IntegrityError from concurrent inserts
causing lost increments; import django.db.IntegrityError and explicitly catch
IntegrityError around the get_or_create in EventMetricsHourly upsert logic (the
block that calls select_for_update().get_or_create and currently sets obj,
was_created), then on IntegrityError re-query the row
(EventMetricsHourly.objects.get(...) for the same organization_id, timestamp,
metric_name, project, tag), set was_created=False and proceed to update counters
and labels as in the existing else-path; keep the generic Exception handler for
other errors but ensure you continue after logging; apply the same explicit
IntegrityError catch-and-retry pattern to the _bulk_upsert_daily and
_bulk_upsert_monthly blocks as well.
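For the ON CONFLICT alternative the comment above mentions, a minimal sketch of an atomic upsert via raw SQL; the table and column names are read off the model fields and are assumptions rather than the actual schema, and note that a NULL tag will not match the unique index (NULLs compare as distinct in PostgreSQL), so tag needs normalizing to a non-null value first:

from django.db import connection


def upsert_hourly_increment(org_id, hour_ts, metric_name, project, tag, value, count):
    """Insert-or-increment in one statement; concurrent writers serialize
    on the unique index instead of racing get_or_create()."""
    with connection.cursor() as cursor:
        cursor.execute(
            """
            INSERT INTO event_metrics_hourly
                (organization_id, timestamp, metric_name, project, tag,
                 metric_value, metric_count)
            VALUES (%s, %s, %s, %s, %s, %s, %s)
            ON CONFLICT (organization_id, timestamp, metric_name, project, tag)
            DO UPDATE SET
                metric_value = event_metrics_hourly.metric_value + EXCLUDED.metric_value,
                metric_count = event_metrics_hourly.metric_count + EXCLUDED.metric_count
            """,
            [org_id, hour_ts, metric_name, project, tag, value, count],
        )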
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@backend/dashboard_metrics/views.py`:
- Around line 845-848: The current parsing of limit uses limit =
min(int(request.query_params.get("limit", 10)), 50) which will raise a
ValueError for non-integer input and cause a 500; update the parsing in the view
(where organization = self._get_organization() is used and limit is computed) to
defensively parse the query param by catching ValueError/TypeError, falling back
to a safe default (e.g., 10), enforcing a minimum (e.g., 1) and maximum of 50,
and then assign the sanitized value to limit; reference the
request.query_params.get("limit", 10) expression and the limit variable when
making the change.
🧹 Nitpick comments (6)
backend/dashboard_metrics/views.py (1)
331-341: Avoid per-metric lookup for metric_type in summary.
The loop issues one query per metric to fetch metric_type. You can pull it in the initial aggregation to remove the N+1.
♻️ Suggested refactor
-        summary = (
-            queryset.values("metric_name")
-            .annotate(
-                total_value=Sum("metric_value"),
-                total_count=Sum("metric_count"),
-                average_value=Avg("metric_value"),
-                min_value=Min("metric_value"),
-                max_value=Max("metric_value"),
-            )
-            .order_by("metric_name")
-        )
-
-        # Add metric_type from the first record of each metric
-        summary_list = []
-        for row in summary:
-            metric_type_record = (
-                queryset.filter(metric_name=row["metric_name"])
-                .values("metric_type")
-                .first()
-            )
-            row["metric_type"] = (
-                metric_type_record["metric_type"] if metric_type_record else "counter"
-            )
-            summary_list.append(row)
+        summary_list = list(
+            queryset.values("metric_name", "metric_type")
+            .annotate(
+                total_value=Sum("metric_value"),
+                total_count=Sum("metric_count"),
+                average_value=Avg("metric_value"),
+                min_value=Min("metric_value"),
+                max_value=Max("metric_value"),
+            )
+            .order_by("metric_name")
+        )
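One thing worth double-checking with this refactor: adding metric_type to values() widens the GROUP BY, so a metric name recorded under two different metric_type values would yield two summary rows instead of one. A minimal sketch of the grouping semantics, with the model import assumed:

from django.db.models import Sum

# values() before annotate() defines the GROUP BY columns, roughly:
#   SELECT metric_name, metric_type, SUM(metric_value) AS total_value
#   FROM ... GROUP BY metric_name, metric_type
summary = (
    EventMetricsHourly.objects.values("metric_name", "metric_type")
    .annotate(total_value=Sum("metric_value"))
    .order_by("metric_name")
)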
frontend/src/components/metrics-dashboard/MetricsChart.jsx (5)
122-128: Clean up forEach callback to avoid returning a value.
The static analysis correctly identifies that Set.add() returns a value, and forEach callbacks should not return values. While this works, it's cleaner to use a block body.
♻️ Suggested fix
 data.daily_trend.forEach((item) => {
   if (item.metrics) {
-    Object.keys(item.metrics).forEach((key) => metricNamesSet.add(key));
+    Object.keys(item.metrics).forEach((key) => {
+      metricNamesSet.add(key);
+    });
   }
 });
179-197: Move static color constants to module level.
METRIC_BAR_COLORS, DEFAULT_BAR_COLORS, and getBarColor are recreated on every render. Since these are static, move them outside the component for minor performance improvement and consistency with TREND_COLORS at line 21.
♻️ Suggested refactor
Move these definitions after line 31 (after TREND_COLORS):

// Bar colors - metric-specific for key metrics, fallback array for others
const METRIC_BAR_COLORS = {
  pages_processed: "#1890ff",
  documents_processed: "#52c41a",
  failed_pages: "#ff4d4f",
  llm_calls: "#722ed1",
  prompt_executions: "#faad14",
  llm_usage: "#13c2c2",
};
const DEFAULT_BAR_COLORS = [
  "#1890ff",
  "#faad14",
  "#52c41a",
  "#722ed1",
  "#13c2c2",
];
const getBarColor = (metric, idx) =>
  METRIC_BAR_COLORS[metric] || DEFAULT_BAR_COLORS[idx % DEFAULT_BAR_COLORS.length];
410-416: Same forEach callback issue as noted earlier.
Apply the same fix here for consistency.
♻️ Suggested fix
 data.daily_trend.forEach((item) => {
   if (item.metrics) {
-    Object.keys(item.metrics).forEach((key) => metricNamesSet.add(key));
+    Object.keys(item.metrics).forEach((key) => {
+      metricNamesSet.add(key);
+    });
   }
 });
779-792: Move TREND_BAR_COLORS and getTrendColor to module level.
Same issue as in MetricsChart — these static constants are recreated on every render.
396-551: Consider extracting shared chart logic into reusable utilities.
MetricsBreakdown and TrendAnalysisChart share nearly identical patterns:
- Filter dropdown state and handlers (selectedMetrics, handleFilterClick, cardExtra)
- Metric extraction from daily_trend
- Loading/empty state rendering
A custom hook (e.g., useMetricFilter) and a wrapper component for the card chrome could reduce ~100 lines of duplication across these components. This is optional and can be deferred.
Also applies to: 674-848
organization = self._get_organization()
org_id = str(organization.id)
limit = min(int(request.query_params.get("limit", 10)), 50)
Guard recent_activity limit parsing to avoid 500 on invalid input.
int() on a bad limit query param will raise and return a 500. A small guard keeps the endpoint stable.
🛠️ Suggested fix
-        limit = min(int(request.query_params.get("limit", 10)), 50)
+        try:
+            limit = int(request.query_params.get("limit", 10))
+        except (TypeError, ValueError):
+            return Response(
+                {"error": "limit must be an integer"},
+                status=status.HTTP_400_BAD_REQUEST,
+            )
+        limit = min(limit, 50)
🤖 Prompt for AI Agents
In `@backend/dashboard_metrics/views.py` around lines 845 - 848, The current
parsing of limit uses limit = min(int(request.query_params.get("limit", 10)),
50) which will raise a ValueError for non-integer input and cause a 500; update
the parsing in the view (where organization = self._get_organization() is used
and limit is computed) to defensively parse the query param by catching
ValueError/TypeError, falling back to a safe default (e.g., 10), enforcing a
minimum (e.g., 1) and maximum of 50, and then assign the sanitized value to
limit; reference the request.query_params.get("limit", 10) expression and the
limit variable when making the change.
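The AI-agent prompt above prefers clamping with a safe fallback over returning a 400; a minimal sketch of that variant (the parse_limit helper is illustrative, not part of the codebase):

def parse_limit(raw, default=10, lo=1, hi=50):
    """Parse a user-supplied limit; fall back to `default` on junk input
    and clamp the result into [lo, hi]."""
    try:
        value = int(raw)
    except (TypeError, ValueError):
        return default
    return max(lo, min(value, hi))


# Usage inside the view:
# limit = parse_limit(request.query_params.get("limit"))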
Actionable comments posted: 4
🤖 Fix all issues with AI agents
In `@backend/dashboard_metrics/tasks.py`:
- Around line 586-694: The exception handlers inside the org-processing block
are currently catching all Exceptions (the inner try around the metric loop and
the outer try wrapping org processing), which swallows
DatabaseError/OperationalError and prevents Celery's autoretry; change both
handlers to re-raise DB errors so Celery can retry: in the inner except around
the metric loop (where logger.warning(...) and stats["errors"] += 1 happen),
detect and re-raise DatabaseError and OperationalError before handling other
exceptions, and in the outer except (where logger.exception(...) and
stats["errors"] += 1 happen) do the same; keep the existing logging/stats
increments for non-DB exceptions and ensure _upsert_hourly_replace,
_upsert_daily_replace, and _upsert_monthly_replace DB errors will propagate.
In `@frontend/src/components/metrics-dashboard/MetricsChart.jsx`:
- Around line 258-286: The chart can render blank when metricsToShow (computed
from selectedMetrics and availableMetrics) is empty; add an explicit early
return that checks metricsToShow.length === 0 and renders the same Card used for
other empty states (e.g., title "Trend Analysis (Last 30 Days)" with an Empty
component and a helpful description like "No metrics selected or available")
before the chart rendering logic so users see a clear empty state instead of a
blank chart.
In `@frontend/src/hooks/useMetricsData.js`:
- Around line 114-118: When orgId becomes falsy the effect should clear the
recent-activity state to avoid showing stale data; inside the useEffect that
currently calls fetchActivity() when orgId is truthy (the effect depending on
[orgId, limit]), add an else branch that resets your activity state (e.g., call
the state updater used to store recent activity such as setRecentActivity([]) or
setActivity([])) so recent activity is cleared whenever orgId is
null/undefined/empty.
- Around line 67-71: In useMetricsData's useEffect (the effect that depends on
[orgId, startDate, endDate]) ensure that when orgId is falsy you clear the
hook's cached metric state instead of simply skipping fetchOverview; add an else
branch that resets the relevant state variables (e.g., metrics/overview state,
loading/error flags) so stale data from the previous org is removed on
logout/org switch; update the effect that currently calls fetchOverview() to
conditionally call fetchOverview() when orgId is present and otherwise call the
state-reset logic.
🧹 Nitpick comments (1)
backend/dashboard_metrics/tasks.py (1)
396-411: Avoid unused obj from update_or_create.
Rename to _ / _obj to address RUF059 and reduce lint noise.
♻️ Proposed cleanup
-            obj, was_created = EventMetricsHourly._base_manager.update_or_create(
+            _, was_created = EventMetricsHourly._base_manager.update_or_create(
                 organization_id=org_id,
                 timestamp=hour_ts,
                 metric_name=metric_name,
                 project=project,
                 tag=tag,
                 defaults={
                     "metric_type": agg["metric_type"],
                     "metric_value": agg["value"],
                     "metric_count": agg["count"],
                     "labels": agg["labels"],
                 },
             )
-            obj, was_created = EventMetricsDaily._base_manager.update_or_create(
+            _, was_created = EventMetricsDaily._base_manager.update_or_create(
                 organization_id=org_id,
                 date=date_val,
                 metric_name=metric_name,
                 project=project,
                 tag=tag,
                 defaults={
                     "metric_type": agg["metric_type"],
                     "metric_value": agg["value"],
                     "metric_count": agg["count"],
                     "labels": agg["labels"],
                 },
             )
-            obj, was_created = EventMetricsMonthly._base_manager.update_or_create(
+            _, was_created = EventMetricsMonthly._base_manager.update_or_create(
                 organization_id=org_id,
                 month=month_val,
                 metric_name=metric_name,
                 project=project,
                 tag=tag,
                 defaults={
                     "metric_type": agg["metric_type"],
                     "metric_value": agg["value"],
                     "metric_count": agg["count"],
                     "labels": agg["labels"],
                 },
             )
Also applies to: 438-452, 479-492
try:
    for metric_name, query_method, is_histogram in metric_configs:
        metric_type = MetricType.HISTOGRAM if is_histogram else MetricType.COUNTER

        try:
            # === HOURLY AGGREGATION (last 24 hours) ===
            hourly_results = query_method(
                org_id, hourly_start, end_date, granularity="hour"
            )
            for row in hourly_results:
                period = row["period"]
                value = row["value"] or 0
                hour_ts = _truncate_to_hour(period)
                key = (org_id, hour_ts.isoformat(), metric_name, "default", None)

                if key not in hourly_agg:
                    hourly_agg[key] = {
                        "metric_type": metric_type,
                        "value": 0,
                        "count": 0,
                        "labels": {},
                    }
                hourly_agg[key]["value"] += value
                hourly_agg[key]["count"] += 1

            # === DAILY AGGREGATION (last 7 days) ===
            daily_results = query_method(
                org_id, daily_start, end_date, granularity="day"
            )
            for row in daily_results:
                period = row["period"]
                value = row["value"] or 0
                day_ts = _truncate_to_day(period)
                key = (
                    org_id,
                    day_ts.date().isoformat(),
                    metric_name,
                    "default",
                    None,
                )

                if key not in daily_agg:
                    daily_agg[key] = {
                        "metric_type": metric_type,
                        "value": 0,
                        "count": 0,
                        "labels": {},
                    }
                daily_agg[key]["value"] += value
                daily_agg[key]["count"] += 1

            # === MONTHLY AGGREGATION (last 2 months) ===
            monthly_results = query_method(
                org_id, monthly_start, end_date, granularity="day"
            )
            # Group results by month and create separate records
            monthly_buckets: dict[str, dict] = {}
            for row in monthly_results:
                period = row["period"]
                value = row["value"] or 0
                month_ts = _truncate_to_month(period)
                month_key_str = month_ts.date().isoformat()

                if month_key_str not in monthly_buckets:
                    monthly_buckets[month_key_str] = {"value": 0, "count": 0}
                monthly_buckets[month_key_str]["value"] += value
                monthly_buckets[month_key_str]["count"] += 1

            # Create aggregation entries for each month
            for month_key_str, bucket in monthly_buckets.items():
                month_key = (
                    org_id,
                    month_key_str,
                    metric_name,
                    "default",
                    None,
                )
                monthly_agg[month_key] = {
                    "metric_type": metric_type,
                    "value": bucket["value"],
                    "count": bucket["count"] or 1,
                    "labels": {},
                }

        except Exception as e:
            logger.warning(f"Error querying {metric_name} for org {org_id}: {e}")
            stats["errors"] += 1

    # Upsert all three tiers
    if hourly_agg:
        c, u = _upsert_hourly_replace(hourly_agg)
        stats["hourly"]["created"] += c
        stats["hourly"]["updated"] += u

    if daily_agg:
        c, u = _upsert_daily_replace(daily_agg)
        stats["daily"]["created"] += c
        stats["daily"]["updated"] += u

    if monthly_agg:
        c, u = _upsert_monthly_replace(monthly_agg)
        stats["monthly"]["created"] += c
        stats["monthly"]["updated"] += u

    stats["orgs_processed"] += 1

except Exception as e:
    logger.exception(f"Error processing org {org_id}: {e}")
    stats["errors"] += 1
Let DB errors propagate to Celery's autoretry mechanism.
The task has autoretry_for=(DatabaseError, OperationalError) configured, but the nested except Exception handlers at lines 671 and 693 catch these errors before they can trigger Celery's automatic retry. This means transient database errors won't be retried, and partial aggregates can be committed. Separate database errors from generic exceptions and re-raise them.
Suggested fix
-            except Exception as e:
-                logger.warning(f"Error querying {metric_name} for org {org_id}: {e}")
-                stats["errors"] += 1
+            except (DatabaseError, OperationalError):
+                logger.exception(
+                    "DB error querying %s for org %s", metric_name, org_id
+                )
+                raise
+            except Exception as e:
+                logger.warning(
+                    "Error querying %s for org %s: %s", metric_name, org_id, e
+                )
+                stats["errors"] += 1
@@
-        except Exception as e:
-            logger.exception(f"Error processing org {org_id}: {e}")
-            stats["errors"] += 1
+        except (DatabaseError, OperationalError):
+            logger.exception("DB error processing org %s", org_id)
+            raise
+        except Exception as e:
+            logger.exception("Error processing org %s: %s", org_id, e)
+            stats["errors"] += 1
🧰 Tools
🪛 Ruff (0.14.14)
[warning] 670-670: Do not catch blind exception: Exception
(BLE001)
[warning] 693-693: Redundant exception object included in logging.exception call
(TRY401)
🤖 Prompt for AI Agents
In `@backend/dashboard_metrics/tasks.py` around lines 586 - 694, The exception
handlers inside the org-processing block are currently catching all Exceptions
(the inner try around the metric loop and the outer try wrapping org
processing), which swallows DatabaseError/OperationalError and prevents Celery's
autoretry; change both handlers to re-raise DB errors so Celery can retry: in
the inner except around the metric loop (where logger.warning(...) and
stats["errors"] += 1 happen), detect and re-raise DatabaseError and
OperationalError before handling other exceptions, and in the outer except
(where logger.exception(...) and stats["errors"] += 1 happen) do the same; keep
the existing logging/stats increments for non-DB exceptions and ensure
_upsert_hourly_replace, _upsert_daily_replace, and _upsert_monthly_replace DB
errors will propagate.
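To make the interaction explicit, a minimal sketch of why the re-raise matters; the decorator options and the run_aggregation helper are illustrative, not the actual task definition:

import logging

from celery import shared_task
from django.db import DatabaseError, OperationalError

logger = logging.getLogger(__name__)


def run_aggregation():
    """Placeholder for the real per-org query/upsert loop."""


@shared_task(
    autoretry_for=(DatabaseError, OperationalError),
    retry_backoff=True,
    max_retries=3,
)
def aggregate_from_sources():
    try:
        run_aggregation()
    except (DatabaseError, OperationalError):
        # Celery only schedules an automatic retry for exceptions that
        # escape the task body, so a swallowed DB error disables
        # autoretry_for entirely. Re-raise and let Celery reschedule.
        raise
    except Exception:
        logger.exception("Non-retryable aggregation error")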
// Filter to only show selected metrics that exist
const metricsToShow = useMemo(() => {
  return selectedMetrics.filter((m) => availableMetrics.includes(m));
}, [selectedMetrics, availableMetrics]);

if (loading) {
  return (
    <Card
      title="Trend Analysis (Last 30 Days)"
      className="metrics-chart-card"
    >
      <div className="metrics-loading">
        <Spin />
      </div>
    </Card>
  );
}

if (!chartData.length) {
  return (
    <Card
      title="Trend Analysis (Last 30 Days)"
      className="metrics-chart-card"
    >
      <Empty description="No data available" />
    </Card>
  );
}
Handle “no selected metrics” to avoid a blank chart.
If selectedMetrics don’t exist in availableMetrics, metricsToShow becomes empty and the chart renders without lines. Show an explicit empty state so users understand what happened.
✅ Suggested fix
   if (!chartData.length) {
     return (
       <Card
         title="Trend Analysis (Last 30 Days)"
         className="metrics-chart-card"
       >
         <Empty description="No data available" />
       </Card>
     );
   }
+
+  if (!metricsToShow.length) {
+    return (
+      <Card
+        title="Trend Analysis (Last 30 Days)"
+        className="metrics-chart-card"
+      >
+        <Empty description="No metrics selected" />
+      </Card>
+    );
+  }
🤖 Prompt for AI Agents
In `@frontend/src/components/metrics-dashboard/MetricsChart.jsx` around lines 258
- 286, The chart can render blank when metricsToShow (computed from
selectedMetrics and availableMetrics) is empty; add an explicit early return
that checks metricsToShow.length === 0 and renders the same Card used for other
empty states (e.g., title "Trend Analysis (Last 30 Days)" with an Empty
component and a helpful description like "No metrics selected or available")
before the chart rendering logic so users see a clear empty state instead of a
blank chart.
useEffect(() => {
  if (orgId) {
    fetchOverview();
  }
}, [orgId, startDate, endDate]); // eslint-disable-line
Clear cached UI state when orgId is missing to avoid cross‑org data bleed.
When orgId becomes falsy, the hook skips fetching but keeps the previous org’s data in state. This can surface stale metrics after logout/org switch. Reset state in the useEffect else branch.
✅ Suggested fix
 useEffect(() => {
   if (orgId) {
     fetchOverview();
+  } else {
+    setData(null);
+    setError(null);
+    setLoading(false);
   }
 }, [orgId, startDate, endDate]); // eslint-disable-line
🤖 Prompt for AI Agents
In `@frontend/src/hooks/useMetricsData.js` around lines 67 - 71, In
useMetricsData's useEffect (the effect that depends on [orgId, startDate,
endDate]) ensure that when orgId is falsy you clear the hook's cached metric
state instead of simply skipping fetchOverview; add an else branch that resets
the relevant state variables (e.g., metrics/overview state, loading/error flags)
so stale data from the previous org is removed on logout/org switch; update the
effect that currently calls fetchOverview() to conditionally call
fetchOverview() when orgId is present and otherwise call the state-reset logic.
useEffect(() => {
  if (orgId) {
    fetchActivity();
  }
}, [orgId, limit]); // eslint-disable-line
Reset recent activity state when orgId is unavailable.
Same issue as overview: stale activity can remain visible if orgId drops to null. Clear state in the effect when orgId is falsy.
✅ Suggested fix
 useEffect(() => {
   if (orgId) {
     fetchActivity();
+  } else {
+    setData(null);
+    setError(null);
+    setLoading(false);
   }
 }, [orgId, limit]); // eslint-disable-line
In `@frontend/src/hooks/useMetricsData.js` around lines 114 - 118, When orgId
becomes falsy the effect should clear the recent-activity state to avoid showing
stale data; inside the useEffect that currently calls fetchActivity() when orgId
is truthy (the effect depending on [orgId, limit]), add an else branch that
resets your activity state (e.g., call the state updater used to store recent
activity such as setRecentActivity([]) or setActivity([])) so recent activity is
cleared whenever orgId is null/undefined/empty.
- Implement multi-tier metrics aggregation with celery-batches
- Add hourly/daily/monthly storage tables with proper indexing
- Add dedicated metrics Celery worker (worker-metrics)
- Add live query endpoints for immediate data access
- Add production readiness: retry config, rate limiting, health checks
- Add queue routing and broker resilience configuration
- Add comprehensive unit tests for batch processing

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
for more information, see https://pre-commit.ci
Backend:
- Add EventMetricsHourly/Daily/Monthly models for pre-aggregated data
- Add MetricsQueryService for querying source tables (Usage, PageUsage)
- Add aggregate_from_sources Celery task (runs every 15 min)
- Add cleanup tasks for hourly (30 days) and daily (365 days) retention
- Add /overview/, /usage/, /activity/ API endpoints
- Fix Celery task queue routing to dashboard_metric_events
- Use _base_manager to bypass org context filter in Celery tasks

Frontend:
- Add MetricsDashboard component with tabs (Overview, Usage, Activity)
- Add MetricsChart, MetricsSummary, MetricsTable components
- Add useMetricsData hook with caching
- Add recharts dependency for visualizations
- Add dashboard.svg icon and sidebar navigation
- Support date range picker for all tabs

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
for more information, see https://pre-commit.ci
- Remove redundant Detailed View tab and MetricsTable component
- Add Recent Activity component showing real-time file executions
- Differentiates by type: ETL, API, Workflow
- Shows status, file name, and relative timestamps
- Convert charts to line charts for 30-day visualization
- Create PagesChart (pages_processed + failed_pages)
- Create TrendAnalysisChart (other metrics with filter dropdown)
- Date picker now only affects summary cards, charts fixed at 30 days
- Add recent-activity API endpoint with proper URL registration
- Fix organization filter queries in services.py

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Reduce default limit from 10 to 5 items
- Add max-height with overflow scroll for compact display

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Cleanup after dashboard refactoring:
- Remove unused chart components (MetricsChart, MetricsBreakdown, MultiSeriesChart, MetricsComparison)
- Remove unused hooks (useMetricsData, useMetricsSummary, useMetricsSeries)
- Remove unused imports (BarChart, Bar, Progress)
- Remove unused CSS styles (.usage-quota-*, .trend-*, .breakdown-*)
- Update exports to only include used components
- Add failed_pages to metrics aggregation task

Reduces bundle size by ~982 lines of dead code.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
acb9e9c to ce6bba3
Test Results
Summary
Runner Tests - Full Report
SDK1 Tests - Full Report
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backend/pyproject.toml (1)
15-20: ⚠️ Potential issue | 🟠 Major
Update celery-batches to 0.10 or later for Celery 5.3.x and Python 3.12 compatibility.
celery-batches 0.9.0 targets Celery 5.0–5.1 and does not support Celery 5.3.x. For Celery 5.3.x and Python 3.12, use celery-batches>=0.10 (or latest 0.11). No known security advisories exist for this package.
🤖 Fix all issues with AI agents
In `@backend/dashboard_metrics/management/commands/backfill_metrics.py`:
- Around line 202-256: The code currently increments the per-key "count" by 1 in
hourly_agg/daily_agg/monthly_agg which yields bucket counts instead of
source-record counts; update the increments in the loops in backfill_metrics.py
(the blocks that update hourly_agg[key]["count"], daily_agg[key]["count"], and
monthly_agg[key]["count"]) to use an explicit row-level count value from the
query result (e.g., row.get("count")) falling back to 1 if missing: set
row_count = row.get("count", None) and then hourly_agg[key]["count"] +=
row_count if row_count is not None else (value if metric_type == "counter" else
1); apply the same change for the daily and monthly aggregation loops so
averages use source-record counts (and confirm MetricsQueryService returns a
"count" field).
In `@backend/dashboard_metrics/tasks.py`:
- Around line 121-129: The event handler is passing a None tag into model
upserts which conflicts with the EventMetrics* CharField (null=False); normalize
tag to a non-null string (e.g., "") right after reading it from the event
(replace tag = event.get("tag") with a normalized value) and use that normalized
variable in all subsequent get_or_create()/update_or_create() calls so the model
default is not overridden by None.
In `@frontend/src/components/metrics-dashboard/MetricsDashboard.jsx`:
- Around line 26-70: chartStart/chartEnd are frozen by useMemo at mount so
refetchChart() keeps using stale dates; update handleRefresh to compute a fresh
30-day range and pass it to the chart refetch, e.g. compute const newStart =
dayjs().subtract(30, "day").toISOString() and const newEnd =
dayjs().toISOString() inside handleRefresh and call refetchChart(newStart,
newEnd). If refetchChart currently doesn't accept parameters, modify
useMetricsOverview(chartStart, chartEnd) to return a refetch that accepts
optional (start,end) overrides and use those in handleRefresh; touch the symbols
chartStart, chartEnd, refetchChart, useMetricsOverview, and handleRefresh.
In `@frontend/src/components/metrics-dashboard/RecentActivity.jsx`:
- Around line 91-125: The List rendering of dataSource={data.activity} lacks a
stable key causing reconciliation warnings; update the List component (the one
using dataSource, renderItem, and STATUS_CONFIG/TYPE_CONFIG) to include
rowKey="id" so React uses each activity's stable id field for keys (ensure the
id property on items is used as the unique key).
🧹 Nitpick comments (4)
unstract/core/src/unstract/core/metrics/registry.py (1)
42-48: Return a stable order from get_all_metrics().
Converting a set to a list yields nondeterministic ordering; sorting avoids flaky output and makes results predictable.
♻️ Suggested fix
- return list(_REGISTERED_METRICS)
+ return sorted(_REGISTERED_METRICS)
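A quick illustration of the instability: string hashing is seeded per process, so iteration order over a set of names can change between runs.

# Run this twice with different hash seeds and the first line can change:
#   PYTHONHASHSEED=1 python demo.py
#   PYTHONHASHSEED=2 python demo.py
metrics = {"llm_calls", "pages_processed", "etl_runs"}
print(list(metrics))    # iteration order depends on the per-process hash seed
print(sorted(metrics))  # ['etl_runs', 'llm_calls', 'pages_processed'] every time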
backend/backend/celery_config.py (1)
1-74: Annotate mutable class attributes as ClassVar to satisfy Ruff.
This clears RUF012 warnings and makes the config intent explicit.
♻️ Suggested fix
-from urllib.parse import quote_plus
+from urllib.parse import quote_plus
+from typing import ClassVar

 from django.conf import settings
 from kombu import Queue
@@
-    broker_transport_options = {
+    broker_transport_options: ClassVar[dict[str, object]] = {
         "max_retries": 3,
         "interval_start": 0,
         "interval_step": 0.2,
         "interval_max": 1,
         "socket_timeout": 30,
         "socket_keepalive": True,
     }
@@
-    task_queues = [
+    task_queues: ClassVar[list[Queue]] = [
         Queue("celery", routing_key="celery"),
         Queue("dashboard_metric_events", routing_key="dashboard_metric_events"),
         Queue("celery_periodic_logs", routing_key="celery_periodic_logs"),
         Queue("celery_log_task_queue", routing_key="celery_log_task_queue"),
         Queue("celery_api_deployments", routing_key="celery_api_deployments"),
         Queue("file_processing", routing_key="file_processing"),
         Queue("api_file_processing", routing_key="api_file_processing"),
         Queue("file_processing_callback", routing_key="file_processing_callback"),
         Queue("api_file_processing_callback", routing_key="api_file_processing_callback"),
     ]
@@
-    task_routes = {
+    task_routes: ClassVar[dict[str, dict[str, str]]] = {
         "dashboard_metrics.process_events": {"queue": "dashboard_metric_events"},
         "dashboard_metrics.aggregate_from_sources": {"queue": "dashboard_metric_events"},
         "dashboard_metrics.cleanup_hourly_data": {"queue": "dashboard_metric_events"},
         "dashboard_metrics.cleanup_daily_data": {"queue": "dashboard_metric_events"},
     }
backend/dashboard_metrics/capture.py (1)
79-88: Avoid mutating caller-supplied labels.
labels or {} followed by in-place updates will modify the caller's dict. Copy before extending and apply the same pattern in all record_* methods.
♻️ Suggested pattern
- metric_labels = labels or {}
+ metric_labels = dict(labels) if labels else {}
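To see the aliasing in isolation, a small sketch; the record helper is illustrative, standing in for any record_* method that enriches labels:

def record(labels=None):
    metric_labels = labels or {}          # aliases the caller's dict when truthy
    metric_labels["source"] = "capture"   # ...so this also mutates the caller's object
    return metric_labels


caller_labels = {"env": "prod"}
record(caller_labels)
print(caller_labels)  # {'env': 'prod', 'source': 'capture'} - surprising side effect


def record_fixed(labels=None):
    metric_labels = dict(labels) if labels else {}  # copy before extending
    metric_labels["source"] = "capture"
    return metric_labels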
backend/dashboard_metrics/management/commands/populate_test_metrics.py (1)
53-77: Consider chaining the original exception for better debugging.
When re-raising CommandError, chain the original exception to preserve the stack trace.
🔧 Suggested improvement
 try:
     org = Organization.objects.get(id=org_id)
-except Organization.DoesNotExist:
-    raise CommandError(f"Organization with ID {org_id} not found")
+except Organization.DoesNotExist as exc:
+    raise CommandError(f"Organization with ID {org_id} not found") from exc
for row in hourly_results:
    period = row["period"]
    value = row["value"] or 0
    hour_ts = self._truncate_to_hour(period)
    key = (org_id, hour_ts.isoformat(), metric_name, "default", None)

    if key not in hourly_agg:
        hourly_agg[key] = {
            "metric_type": metric_type,
            "value": 0,
            "count": 0,
            "labels": {},
        }
    hourly_agg[key]["value"] += value
    hourly_agg[key]["count"] += 1

# Query daily data
daily_results = query_method(
    org_id, start_date, end_date, granularity="day"
)
for row in daily_results:
    period = row["period"]
    value = row["value"] or 0
    day_date = period.date() if hasattr(period, "date") else period
    key = (org_id, day_date.isoformat(), metric_name, "default", None)

    if key not in daily_agg:
        daily_agg[key] = {
            "metric_type": metric_type,
            "value": 0,
            "count": 0,
            "labels": {},
        }
    daily_agg[key]["value"] += value
    daily_agg[key]["count"] += 1

# Query for monthly (aggregate daily results by month)
for row in daily_results:
    period = row["period"]
    value = row["value"] or 0
    if hasattr(period, "date"):
        month_date = period.replace(day=1).date()
    else:
        month_date = period.replace(day=1)
    key = (org_id, month_date.isoformat(), metric_name, "default", None)

    if key not in monthly_agg:
        monthly_agg[key] = {
            "metric_type": metric_type,
            "value": 0,
            "count": 0,
            "labels": {},
        }
    monthly_agg[key]["value"] += value
    monthly_agg[key]["count"] += 1
Verify metric_count represents source-record counts, not bucket counts.
Right now metric_count increments by 1 per aggregated row. If MetricsQueryService already returns per-period aggregates (likely, given the granularity parameter), this makes metric_count equal the number of time buckets, which will skew averages (total_value / metric_count). Please confirm the query shape and, if it already aggregates, use an explicit row count (or for counter metrics, the aggregated count) instead of +1.
Suggested adjustment (use a row-level count when available)
- value = row["value"] or 0
+ value = row["value"] or 0
+ row_count = row.get("count", 1)
hour_ts = self._truncate_to_hour(period)
key = (org_id, hour_ts.isoformat(), metric_name, "default", None)
@@
- hourly_agg[key]["count"] += 1
+        hourly_agg[key]["count"] += row_count

🤖 Prompt for AI Agents
In `@backend/dashboard_metrics/management/commands/backfill_metrics.py` around
lines 202 - 256, The code currently increments the per-key "count" by 1 in
hourly_agg/daily_agg/monthly_agg which yields bucket counts instead of
source-record counts; update the increments in the loops in backfill_metrics.py
(the blocks that update hourly_agg[key]["count"], daily_agg[key]["count"], and
monthly_agg[key]["count"]) to use an explicit row-level count value from the
query result (e.g., row.get("count")) falling back to 1 if missing: set
row_count = row.get("count", None) and then hourly_agg[key]["count"] +=
row_count if row_count is not None else (value if metric_type == "counter" else
1); apply the same change for the daily and monthly aggregation loops so
averages use source-record counts (and confirm MetricsQueryService returns a
"count" field).
timestamp = event.get("timestamp", timezone.now().timestamp())
org_id = event["org_id"]
metric_name = event["metric_name"]
metric_value = event.get("metric_value", 1)
metric_type = event.get("metric_type", MetricType.COUNTER)
labels = event.get("labels", {})
project = event.get("project", "default")
tag = event.get("tag")
Normalize tag to a non-null value before upserts.
The EventMetrics* models define tag as a CharField with null=False (implicit) and default="". When event["tag"] is missing, event.get("tag") returns None, which is passed explicitly to get_or_create() and update_or_create() at lines 234, 288, 342, and 403. This overrides the model default and causes an IntegrityError when Django attempts to persist None to the NOT NULL database column, dropping events.
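A quick sketch of the normalization; the helper is illustrative, but the None-overrides-default behavior is standard Django ORM semantics:

DEFAULT_TAG = ""  # mirrors the model field's default=""


def extract_tag(event: dict) -> str:
    # dict.get("tag") returns None for a missing key, and passing None into
    # update_or_create() overrides the column default, hitting the NOT NULL
    # constraint. "or" folds both None and "" back to the safe default.
    return event.get("tag") or DEFAULT_TAG


assert extract_tag({}) == ""
assert extract_tag({"tag": None}) == ""
assert extract_tag({"tag": "batch-7"}) == "batch-7"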
Suggested fix
- tag = event.get("tag")
+    tag = event.get("tag") or ""

🤖 Prompt for AI Agents
In `@backend/dashboard_metrics/tasks.py` around lines 121 - 129, The event handler
is passing a None tag into model upserts which conflicts with the EventMetrics*
CharField (null=False); normalize tag to a non-null string (e.g., "") right
after reading it from the event (replace tag = event.get("tag") with a
normalized value) and use that normalized variable in all subsequent
get_or_create()/update_or_create() calls so the model default is not overridden
by None.
// Fixed 30-day range for activity charts (independent of date picker)
const chartStart = useMemo(
  () => dayjs().subtract(30, "day").toISOString(),
  []
);
const chartEnd = useMemo(() => dayjs().toISOString(), []);

// API hooks - pass date range to overview (for summary cards)
const {
  data: overviewData,
  loading: overviewLoading,
  error: overviewError,
  refetch: refetchOverview,
} = useMetricsOverview(
  dateRange[0]?.toISOString(),
  dateRange[1]?.toISOString()
);

// Fixed 30-day data for charts
const {
  data: chartData,
  loading: chartLoading,
  refetch: refetchChart,
} = useMetricsOverview(chartStart, chartEnd);

// Recent activity (real-time, no date filter)
const {
  data: activityData,
  loading: activityLoading,
  refetch: refetchActivity,
} = useRecentActivity(5);

// Handle date range change
const handleDateChange = useCallback((dates) => {
  if (dates && dates.length === 2) {
    setDateRange(dates);
  }
}, []);

// Handle refresh
const handleRefresh = useCallback(() => {
  refetchOverview();
  refetchChart();
  refetchActivity();
}, [refetchOverview, refetchChart, refetchActivity]);
Refresh doesn't advance the 30-day chart window.
chartStart/chartEnd are frozen at mount with empty dependency arrays, so refetchChart() reuses the same range. Since the UI labels ("Last 30 Days") and presets indicate a rolling window, the chart data should reflect the current last 30 days on refresh, not dates from component mount time.
Update the range on refresh to align with user expectations:
Suggested fix
- // Fixed 30-day range for activity charts (independent of date picker)
- const chartStart = useMemo(
- () => dayjs().subtract(30, "day").toISOString(),
- []
- );
- const chartEnd = useMemo(() => dayjs().toISOString(), []);
+ // Rolling 30-day range for activity charts (independent of date picker)
+ const [chartRange, setChartRange] = useState(() => [
+ dayjs().subtract(30, "day"),
+ dayjs(),
+ ]);
+ const chartStart = useMemo(
+ () => chartRange[0].toISOString(),
+ [chartRange]
+ );
+ const chartEnd = useMemo(() => chartRange[1].toISOString(), [chartRange]);
@@
- const handleRefresh = useCallback(() => {
- refetchOverview();
- refetchChart();
- refetchActivity();
- }, [refetchOverview, refetchChart, refetchActivity]);
+ const handleRefresh = useCallback(() => {
+ setChartRange([dayjs().subtract(30, "day"), dayjs()]);
+ refetchOverview();
+ refetchChart();
+ refetchActivity();
+ }, [refetchOverview, refetchChart, refetchActivity]);📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // Fixed 30-day range for activity charts (independent of date picker) | |
| const chartStart = useMemo( | |
| () => dayjs().subtract(30, "day").toISOString(), | |
| [] | |
| ); | |
| const chartEnd = useMemo(() => dayjs().toISOString(), []); | |
| // API hooks - pass date range to overview (for summary cards) | |
| const { | |
| data: overviewData, | |
| loading: overviewLoading, | |
| error: overviewError, | |
| refetch: refetchOverview, | |
| } = useMetricsOverview( | |
| dateRange[0]?.toISOString(), | |
| dateRange[1]?.toISOString() | |
| ); | |
| // Fixed 30-day data for charts | |
| const { | |
| data: chartData, | |
| loading: chartLoading, | |
| refetch: refetchChart, | |
| } = useMetricsOverview(chartStart, chartEnd); | |
| // Recent activity (real-time, no date filter) | |
| const { | |
| data: activityData, | |
| loading: activityLoading, | |
| refetch: refetchActivity, | |
| } = useRecentActivity(5); | |
| // Handle date range change | |
| const handleDateChange = useCallback((dates) => { | |
| if (dates && dates.length === 2) { | |
| setDateRange(dates); | |
| } | |
| }, []); | |
| // Handle refresh | |
| const handleRefresh = useCallback(() => { | |
| refetchOverview(); | |
| refetchChart(); | |
| refetchActivity(); | |
| }, [refetchOverview, refetchChart, refetchActivity]); | |
| // Rolling 30-day range for activity charts (independent of date picker) | |
| const [chartRange, setChartRange] = useState(() => [ | |
| dayjs().subtract(30, "day"), | |
| dayjs(), | |
| ]); | |
| const chartStart = useMemo( | |
| () => chartRange[0].toISOString(), | |
| [chartRange] | |
| ); | |
| const chartEnd = useMemo(() => chartRange[1].toISOString(), [chartRange]); | |
| // API hooks - pass date range to overview (for summary cards) | |
| const { | |
| data: overviewData, | |
| loading: overviewLoading, | |
| error: overviewError, | |
| refetch: refetchOverview, | |
| } = useMetricsOverview( | |
| dateRange[0]?.toISOString(), | |
| dateRange[1]?.toISOString() | |
| ); | |
| // Fixed 30-day data for charts | |
| const { | |
| data: chartData, | |
| loading: chartLoading, | |
| refetch: refetchChart, | |
| } = useMetricsOverview(chartStart, chartEnd); | |
| // Recent activity (real-time, no date filter) | |
| const { | |
| data: activityData, | |
| loading: activityLoading, | |
| refetch: refetchActivity, | |
| } = useRecentActivity(5); | |
| // Handle date range change | |
| const handleDateChange = useCallback((dates) => { | |
| if (dates && dates.length === 2) { | |
| setDateRange(dates); | |
| } | |
| }, []); | |
| // Handle refresh | |
| const handleRefresh = useCallback(() => { | |
| setChartRange([dayjs().subtract(30, "day"), dayjs()]); | |
| refetchOverview(); | |
| refetchChart(); | |
| refetchActivity(); | |
| }, [refetchOverview, refetchChart, refetchActivity]); |
🤖 Prompt for AI Agents
In `@frontend/src/components/metrics-dashboard/MetricsDashboard.jsx` around lines
26 - 70, chartStart/chartEnd are frozen by useMemo at mount so refetchChart()
keeps using stale dates; update handleRefresh to compute a fresh 30-day range
and pass it to the chart refetch, e.g. compute const newStart =
dayjs().subtract(30, "day").toISOString() and const newEnd =
dayjs().toISOString() inside handleRefresh and call refetchChart(newStart,
newEnd). If refetchChart currently doesn't accept parameters, modify
useMetricsOverview(chartStart, chartEnd) to return a refetch that accepts
optional (start,end) overrides and use those in handleRefresh; touch the symbols
chartStart, chartEnd, refetchChart, useMetricsOverview, and handleRefresh.
<List
  className="recent-activity-list"
  dataSource={data.activity}
  size="small"
  split={false}
  renderItem={(item) => {
    const statusConfig =
      STATUS_CONFIG[item.status] || STATUS_CONFIG.QUEUED;
    const typeConfig = TYPE_CONFIG[item.type] || TYPE_CONFIG.workflow;

    return (
      <List.Item className="recent-activity-item">
        <div className="recent-activity-content">
          <div className="recent-activity-header">
            <span
              className="recent-activity-type"
              style={{ color: typeConfig.color }}
            >
              {typeConfig.icon}
              <Text strong style={{ marginLeft: 6 }}>
                {typeConfig.label}
              </Text>
            </span>
            <Tag color={statusConfig.color}>{statusConfig.label}</Tag>
          </div>
          <Text className="recent-activity-file" ellipsis>
            {item.file_name}
          </Text>
          <Text type="secondary" className="recent-activity-time">
            {dayjs(item.created_at).fromNow()}
          </Text>
        </div>
      </List.Item>
    );
  }}
🌐 Web query:
Ant Design List component rowKey prop best practice React key usage renderItem
💡 Result:
Ant Design <List /> + rowKey: best practice (and how it relates to React key)
1) Prefer rowKey when you use dataSource
When you render an Ant Design List from dataSource, let Ant Design assign the React keys by providing rowKey (string field name or function). AntD’s API explicitly defines rowKey as the item’s unique value (a React.Key) and defaults to "key". [1]
Recommended
<List
  dataSource={users}
  rowKey="id" // stable, unique
  renderItem={(user) => <List.Item>{user.name}</List.Item>}
/>

or:
<List
  dataSource={users}
  rowKey={(user) => user.id} // stable, unique
  renderItem={(user) => <List.Item>{user.name}</List.Item>}
/>

If your items already have item.key, you can rely on the default (rowKey="key"). [1]
2) Don’t generate keys inside renderItem
Avoid patterns like key={Math.random()} / key={uuid()} during render—keys must be stable across renders for React’s reconciliation to work correctly. React recommends using a value that uniquely identifies the item among its siblings (often an ID from your data). [2]
3) Avoid array index keys except for truly static lists
React allows using the array index only as a last resort; it’s not recommended if items can be inserted/removed/reordered because it can cause incorrect UI/state reuse. [2]
4) If you don’t use dataSource, you must provide keys yourself
If you manually render children (instead of dataSource), then you’re responsible for key on the elements you map (normal React rule). [2]
{users.map((u) => (
  <List.Item key={u.id}>{u.name}</List.Item>
))}

Rule of thumb: for AntD List with dataSource, set rowKey to a stable unique ID; don't also "fight it" by inventing keys inside renderItem.
Sources: Ant Design List API (rowKey) [1]; React “Lists and Keys” guidance [2].
Add a stable rowKey to the List component.
The List component uses dataSource={data.activity} without specifying rowKey, which can trigger React reconciliation warnings and lead to less efficient diffing. The activity items have a stable id field (confirmed in PropTypes); use rowKey="id" to ensure proper key-based rendering.
🛠️ Suggested fix
<List
className="recent-activity-list"
dataSource={data.activity}
size="small"
split={false}
+ rowKey="id"
  renderItem={(item) => {

🤖 Prompt for AI Agents
In `@frontend/src/components/metrics-dashboard/RecentActivity.jsx` around lines 91
- 125, The List rendering of dataSource={data.activity} lacks a stable key
causing reconciliation warnings; update the List component (the one using
dataSource, renderItem, and STATUS_CONFIG/TYPE_CONFIG) to include rowKey="id" so
React uses each activity's stable id field for keys (ensure the id property on
items is used as the unique key).




What
Why
How

backend/dashboard_metrics/: Django app with models, views, services, and tasks
unstract/core/src/unstract/core/metrics/: library with queue backend using Kombu
dashboard_metric_events queue

Can this PR break any existing features? If yes, please list possible items. If no, please explain why. (PS: Admins do not merge the PR without this section filled)
Database Migrations
Env Config
Relevant Docs
Related Issues or PRs
Dependencies Versions
Notes on Testing
pytest backend/dashboard_metrics/tests/
docker compose up worker-metrics
GET /api/v2/metrics/health/

Screenshots
N/A
Checklist
I have read and understood the Contribution Guidelines.