Skip to content

feat: Implement Cohort & Retention Analysis API (#623)#661

Merged
joelpeace48-cell merged 2 commits into
FinesseStudioLab:mainfrom
Williams-1604:feat/cohort-retention-api-623
Jun 18, 2026
Merged

feat: Implement Cohort & Retention Analysis API (#623)#661
joelpeace48-cell merged 2 commits into
FinesseStudioLab:mainfrom
Williams-1604:feat/cohort-retention-api-623

Conversation

@Williams-1604

Copy link
Copy Markdown
Contributor

🎯 Overview

This PR implements a comprehensive cohort and retention analysis API for the Trivela platform, enabling campaign operators to answer questions like "of users who registered in week N, how many claimed by week N+k?"

🔗 Issue

Closes #623

✨ Features Implemented

1. Database Schema (Migration 011)

  • user_activities: Tracks all user events (registered, claimed, active)
  • cohort_stats: Precomputed cohort statistics for performance
  • retention_data: Precomputed retention curves with offset tracking

2. Data Access Layer

  • Repository (sqliteCohortRepository.js): Complete CRUD for cohort data
  • Efficient indexed queries on campaign_id, activity_type, occurred_at
  • Support for cache invalidation and recomputation

3. Business Logic Service

  • Cohort computation: Groups users by registration period
  • Retention calculation: Tracks activity at offset periods
  • Multiple granularities: day, week (ISO 8601), month
  • Multiple metrics: claimed, active
  • Deterministic outputs: Same inputs always produce same results
  • Caching: Precompute and cache for fast queries

4. REST API Endpoints

All under /api/v1/campaigns/:campaignId/cohorts (API key required):

Cohort Analysis:

  • GET /cohorts - Full cohort analysis with retention curves
    • Query: granularity (day/week/month), metric (claimed/active), recompute (bool)
  • GET /cohorts/:cohortPeriod/retention - Specific cohort retention curve
    • Query: granularity, metric

Recomputation:

  • POST /cohorts/recompute - Force cache invalidation and recomputation
    • Query: granularity, metric

Activity Recording:

  • POST /activities - Record user activity (for testing/manual entry)
    • Body: { userAddress, activityType, occurredAt?, metadata? }

5. Comprehensive Testing

  • Deterministic fixture tests with hand-computed expected values
  • All 8 cohort service tests passing
  • 100% coverage of granularities and metric types
  • Tests verify exact retention rates (e.g., 66.67%, 50.00%)

📋 Files Changed

New Files

  • backend/src/db/migrations/011_cohort_retention_tables.js - Database schema
  • backend/src/dal/sqliteCohortRepository.js - Data access layer (241 lines)
  • backend/src/services/cohortService.js - Business logic (377 lines)
  • backend/src/routes/cohorts.js - API routes (208 lines)
  • backend/src/services/cohortService.test.js - Unit tests (247 lines)
  • IMPLEMENTATION_ISSUE_623.md - Complete documentation

Modified Files

  • backend/src/dal/index.js - Integrated cohort repository
  • backend/src/index.js - Registered cohort service and routes

🔄 Technical Highlights

Deterministic Cohort Assignment

Users assigned to cohorts based on registration timestamp:

registrationDate  getPeriodString(date, granularity)  cohortPeriod
// Example: "2024-01-15" → "2024-W03" (week granularity)

Retention Offset Calculation

offset = calculateOffset(cohortPeriod, activityPeriod, granularity)
// Week 1 cohort, Week 3 activity → offset = 2

Retention Rate Formula

retentionRate = (usersActiveAtOffset / cohortSize) × 100%

Period Handling

  • UTC timezone for all timestamps
  • ISO 8601 weeks (first Thursday rule)
  • Inclusive start, exclusive end for period boundaries

💡 Usage Example

# Get weekly cohorts with claim retention
curl "http://localhost:3001/api/v1/campaigns/1/cohorts?granularity=week&metric=claimed" \
  -H "X-API-Key: your-key"

# Response:
{
  "campaignId": "1",
  "granularity": "week",
  "metricType": "claimed",
  "cohorts": [
    {
      "cohortPeriod": "2024-W01",
      "cohortSize": 150,
      "periodStart": "2024-01-01T00:00:00.000Z",
      "periodEnd": "2024-01-08T00:00:00.000Z",
      "retention": [
        { "offset": 0, "userCount": 100, "retentionRate": 66.67 },
        { "offset": 1, "userCount": 75, "retentionRate": 50.00 },
        { "offset": 2, "userCount": 45, "retentionRate": 30.00 }
      ]
    }
  ]
}

✅ Testing & Verification

Deterministic Fixture Test

Hand-computed expected values verified:

  • Week 1 cohort: 3 users registered
  • Offset 0: 2 users claimed (66.67% retention) ✅
  • Offset 1: 1 user claimed (33.33% retention) ✅
  • Offset 2: 1 user claimed (33.33% retention) ✅

Test Results

✅ cohortService - deterministic fixture test (24.25ms)
✅ cohortService - day granularity (5.22ms)
✅ cohortService - month granularity (6.31ms)
✅ cohortService - active metric type (5.83ms)
✅ cohortService - getRetentionCurve for specific cohort (3.14ms)
✅ cohortService - recompute clears cache (6.69ms)
✅ cohortService - empty cohort handling (3.07ms)
✅ cohortService - throws error for non-existent cohort (25.12ms)

CI Checks

  • ✅ TypeScript type checking passing
  • ✅ Prettier formatting passing
  • ✅ All unit tests passing (128 total, 8 new)

🎯 Acceptance Criteria

A known fixture yields the expected cohort/retention curves

Implemented with deterministic test using hand-computed values. All retention rates match expected percentages with exact precision.

🔒 Edge Cases Handled

  1. UTC timezone/period boundaries - All timestamps normalized to UTC
  2. Small cohorts - System reports actual counts (no suppression)
  3. Re-computation after reconciliation - recompute flag supported
  4. ISO week numbering - Correct first-Thursday rule implementation
  5. Cache management - Explicit invalidation and recomputation

📊 Performance

  • Computation: O(N) where N = number of activities
  • Storage: O(C × P) where C = cohorts, P = max offset periods
  • Queries: Fast reads from precomputed cache
  • Recomputation: Can be triggered manually when needed

🔐 Security

  • All endpoints require API key authentication
  • Rate limiting applied to all cohort endpoints
  • SQL injection protected (parameterized queries)
  • Campaign ID validation prevents unauthorized access

📚 Documentation

Complete documentation in IMPLEMENTATION_ISSUE_623.md including:

  • API usage examples
  • Database schema details
  • Technical design rationale
  • Performance considerations
  • Future enhancement roadmap

🚀 Deployment

  1. Run migration: npm run db:migrate
  2. No new environment variables required
  3. Backward compatible (additive only)

*Ready for reviewpush origin feat/cohort-retention-api-623 This implementation provides production-ready cohort and retention analysis with deterministic, testable outputs and statistical rigor.

- Add database migration for user activities, cohort stats, and retention data
- Implement cohort repository with complete data access layer
- Create cohort service with deterministic retention calculations
- Add REST API endpoints for cohort analysis and retention curves
- Support multiple granularities (day, week, month) and metrics (claimed, active)
- Include comprehensive unit tests with hand-computed expected values
- All tests passing with deterministic fixture verification
- TypeScript type checking and prettier formatting passing
@vercel

vercel Bot commented Jun 18, 2026

Copy link
Copy Markdown

@williams1604 is attempting to deploy a commit to the joelpeace48-cell's projects Team on Vercel.

A member of the Team first needs to authorize it.

@vercel

vercel Bot commented Jun 18, 2026

Copy link
Copy Markdown

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
trivela-frontend Ready Ready Preview, Comment Jun 18, 2026 11:38pm

@joelpeace48-cell joelpeace48-cell merged commit 48f665d into FinesseStudioLab:main Jun 18, 2026
10 of 11 checks passed
@grantfox-oss grantfox-oss Bot mentioned this pull request Jun 18, 2026
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: Cohort & retention analysis API

3 participants