Lumen

Lumen is a comprehensive AI traffic analytics platform that tracks, classifies, and helps monetize AI bot traffic (GPTBot, ClaudeBot, PerplexityBot, etc.) across your web applications.

✨ Features

  • 🤖 AI Bot Detection: Identifies 20+ AI crawlers, including those from OpenAI, Anthropic, Google, and Meta
  • 📊 Real-time Analytics: Live dashboard that polls every 10 seconds
  • 🔥 Fire-and-forget SDK: Never blocks user requests; events are captured asynchronously
  • 💰 Revenue Estimation: Calculate potential revenue from licensing AI training data
  • 🔐 Secure by Design: HMAC-SHA256 signed payloads with nonce protection
  • ⚡ High Throughput: Buffered batch inserts to ClickHouse for scalability
  • 🌐 Edge-ready: SDK works in Edge Runtime via the Web Crypto API

🏗️ Architecture

┌───────────────────────────────────────────────────────────────────┐
│                         YOUR APPLICATION                          │
│              (Next.js with @lumen/lumen-nextjs SDK)               │
│                        proxy.ts middleware                        │
└────────────────────────┬──────────────────────────────────────────┘
                         │ HMAC-signed events (fire-and-forget)
                         ▼
┌───────────────────────────────────────────────────────────────────┐
│                         LUMEN SERVER (Go)                         │
│  ┌─────────┐   ┌────────────┐   ┌────────┐   ┌─────────────────┐  │
│  │ Ingest  │ → │ Classifier │ → │ Buffer │ → │ ClickHouse      │  │
│  │   API   │   │ (20+ bots) │   │ (batch)│   │ (analytics DB)  │  │
│  └─────────┘   └────────────┘   └────────┘   └─────────────────┘  │
└────────────────────────┬──────────────────────────────────────────┘
                         │ REST API queries
                         ▼
┌───────────────────────────────────────────────────────────────────┐
│                   LUMEN DASHBOARD (Next.js 16)                    │
│  ┌──────────┐  ┌─────────────┐  ┌───────────┐  ┌──────────────┐   │
│  │ Overview │  │ Time Series │  │Top Routes │  │  Top Bots    │   │
│  └──────────┘  └─────────────┘  └───────────┘  └──────────────┘   │
└───────────────────────────────────────────────────────────────────┘

📁 Monorepo Structure

lumen/
├── docker-compose.yml          # Production-ready Docker setup
├── .env.example                # Environment configuration template
│
├── app/
│   ├── lumen-dashboard/        # Next.js 16 analytics dashboard
│   │   ├── app/                # App Router pages
│   │   ├── components/         # React components
│   │   ├── hooks/              # React Query hooks
│   │   ├── lib/                # Utilities & server actions
│   │   └── docs/               # Dashboard documentation
│   │
│   └── lumen-server/           # Go backend collector
│       ├── cmd/server/         # Entry point
│       └── internal/           # Core modules
│           ├── api/            # HTTP handlers
│           ├── buffer/         # Event batching
│           ├── classifier/     # AI bot detection
│           ├── ingest/         # Event processing
│           ├── models/         # Data structures
│           └── storage/        # ClickHouse operations
│
└── packages/
    └── lumen-sdk/              # Client SDK monorepo
        └── packages/
            ├── lumen-core/     # Framework-agnostic core
            └── lumen-nextjs/   # Next.js adapter

🚀 Quick Start

Prerequisites

  • Docker & Docker Compose
  • Node.js 20+ (for SDK development)
  • Go 1.22+ (for server development)
  • pnpm (for SDK/dashboard)

1. Clone & Start Services

git clone https://github.com/lumen-org/Lumen.git
cd Lumen

# Start all services (server, dashboard, clickhouse)
docker compose up -d

# View logs
docker compose logs -f

2. Access the Dashboard

3. Integrate the SDK

pnpm add @lumen/lumen-nextjs

Create proxy.ts in your Next.js app:

import { createLumen } from '@lumen/lumen-nextjs';
import { NextRequest, NextFetchEvent, NextResponse } from 'next/server';

const tracker = createLumen({
  ingestUrl: process.env.LUMEN_INGEST_URL!,
  keyId: process.env.LUMEN_KEY_ID!,
  hmacSecret: process.env.LUMEN_HMAC_SECRET!,
  sampleRate: 0.1, // 10% sampling
});

export function proxy(req: NextRequest, event: NextFetchEvent) {
  tracker.capture(req, event);
  return NextResponse.next();
}

export const config = {
  matcher: ['/((?!_next/static|_next/image|favicon.ico).*)'],
};

📦 Components

🖥️ Lumen Dashboard

Real-time analytics dashboard built with Next.js 16.

Technology       Purpose
---------------  -----------------------
Next.js 16       App Router framework
TypeScript       Type safety
Tailwind CSS 4   Styling
shadcn/ui        UI components
Recharts         Data visualization
TanStack Query   Data fetching & caching

Pages:

  • /: Overview with stats, traffic chart, top routes/bots
  • /time-series: Detailed traffic breakdown over time
  • /top-routes: Route-level analytics with revenue estimation
  • /top-bots: AI bot/operator traffic analysis

📚 Dashboard Documentation


⚙️ Lumen Server

High-throughput Go backend for event collection and analytics.

Component       Purpose
--------------  ------------------
Chi Router      HTTP routing
ClickHouse      Analytics database
Event Buffer    Batch processing
AI Classifier   Bot detection

Detected AI Bots:

Vendor         Bots
-------------  ------------------------------------
OpenAI         GPTBot, ChatGPT-User, OAI-SearchBot
Anthropic      ClaudeBot, Claude-Web
Google         Google-Extended, Gemini
Perplexity     PerplexityBot
Meta           Meta-ExternalAgent, FacebookBot
Amazon         Amazonbot, BedrockBot
Apple          Applebot-Extended
ByteDance      Bytespider
Common Crawl   CCBot
Cohere         CohereBot
DeepSeek       DeepSeekBot
...and more
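
The table above suggests User-Agent matching against known bot names. The following is a minimal sketch of that idea, not the server's actual classifier (which is written in Go). The `SIGNATURES` table, the `classifyUserAgent` function, the vendor keys other than "openai" and "anthropic", and the case-insensitive substring strategy are all assumptions made for illustration.

```typescript
// Hypothetical signature entry: a lowercase substring and the labels it maps to.
interface BotSignature {
  pattern: string; // lowercase substring searched for in the User-Agent
  vendor: string;  // vendor key, e.g. "openai" (other keys are assumed)
  botName: string; // display name taken from the table above
}

// A small subset of the bots listed above; a real table would cover all of them.
const SIGNATURES: BotSignature[] = [
  { pattern: "gptbot", vendor: "openai", botName: "GPTBot" },
  { pattern: "chatgpt-user", vendor: "openai", botName: "ChatGPT-User" },
  { pattern: "oai-searchbot", vendor: "openai", botName: "OAI-SearchBot" },
  { pattern: "claudebot", vendor: "anthropic", botName: "ClaudeBot" },
  { pattern: "perplexitybot", vendor: "perplexity", botName: "PerplexityBot" },
  { pattern: "ccbot", vendor: "commoncrawl", botName: "CCBot" },
];

interface Classification {
  isAi: boolean;
  vendor: string;
  botName: string;
}

// Returns the first signature whose pattern occurs in the User-Agent,
// or a non-AI result when nothing matches.
function classifyUserAgent(userAgent: string): Classification {
  const ua = userAgent.toLowerCase();
  for (const sig of SIGNATURES) {
    if (ua.indexOf(sig.pattern) !== -1) {
      return { isAi: true, vendor: sig.vendor, botName: sig.botName };
    }
  }
  return { isAi: false, vendor: "", botName: "" };
}
```

Substring matching is simple and fast, but it trusts the User-Agent header, which any client can spoof; a production classifier would likely combine it with other signals.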

📚 Server Documentation


📡 Lumen SDK

Fire-and-forget client SDK for capturing request events.

@lumen/lumen-core   - Framework-agnostic core
@lumen/lumen-nextjs - Next.js 15/16 adapter

Features:

  • HMAC-SHA256 payload signing
  • Deterministic sampling (consistent across distributed systems)
  • Edge Runtime compatible (Web Crypto API)
  • Never blocks user requests
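
To illustrate what deterministic sampling can look like, here is a minimal sketch that hashes a stable key (for example a request ID) and compares the result against the sample rate, so every node makes the same keep/drop decision for the same key. The function names `fnv1a32` and `shouldSample`, and the choice of FNV-1a as the hash, are assumptions; the SDK's actual hashing scheme is not documented here.

```typescript
// 32-bit FNV-1a over the string's UTF-16 code units (not UTF-8 bytes).
function fnv1a32(input: string): number {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // multiply by the FNV prime, wrapping at 32 bits
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

// Maps the hash onto [0, 1) and keeps the event when it falls below sampleRate.
// The same key always yields the same decision, regardless of which node runs it.
function shouldSample(key: string, sampleRate: number): boolean {
  if (sampleRate >= 1) return true;
  if (sampleRate <= 0) return false;
  return fnv1a32(key) / 0x100000000 < sampleRate;
}
```

Because the decision depends only on the key, raising the sample rate later keeps every event that was kept at the lower rate, which makes sampled datasets comparable over time.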

📚 SDK Documentation


🔌 API Reference

Authentication

All endpoints require an X-API-Key header:

curl -H "X-API-Key: your-token" http://localhost:8080/v1/overview

Endpoints

Method   Endpoint                   Description
-------  -------------------------  ----------------------------------
GET      /health                    Health check
POST     /v1/events                 Ingest event batch (1-5000 events)
GET      /v1/overview               Aggregate traffic stats
GET      /v1/timeseries             Time-bucketed traffic data
GET      /v1/top-routes             Route rankings by AI traffic
GET      /v1/top-bots               Bot rankings by request count
GET      /v1/routes                 Available routes list
GET      /v1/route-prices           Saved route pricing
POST     /v1/opportunity/estimate   Revenue estimation

Query Parameters

Parameter   Type       Description
----------  ---------  --------------------
from        ISO 8601   Start time filter
to          ISO 8601   End time filter
route       string     Filter by route path
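
As a rough illustration of how these parameters combine with the X-API-Key header, the sketch below builds a request URL and header set. The helper names `buildAnalyticsUrl` and `authHeaders` are hypothetical, and which endpoints accept which parameters is an assumption, since the table above does not say.

```typescript
// Optional filters from the Query Parameters table.
interface RangeQuery {
  from?: string;  // ISO 8601 start time
  to?: string;    // ISO 8601 end time
  route?: string; // route path filter
}

// Builds e.g. http://localhost:8080/v1/top-routes?from=...&to=...&route=...
// Values are percent-encoded so that ":" and "/" survive inside the query string.
function buildAnalyticsUrl(baseUrl: string, endpoint: string, query: RangeQuery): string {
  const parts: string[] = [];
  if (query.from !== undefined) parts.push("from=" + encodeURIComponent(query.from));
  if (query.to !== undefined) parts.push("to=" + encodeURIComponent(query.to));
  if (query.route !== undefined) parts.push("route=" + encodeURIComponent(query.route));
  const queryString = parts.length > 0 ? "?" + parts.join("&") : "";
  return baseUrl.replace(/\/+$/, "") + endpoint + queryString;
}

// Header set carrying the API token, as described under Authentication.
function authHeaders(apiKey: string): { [name: string]: string } {
  return { "X-API-Key": apiKey };
}
```

The resulting URL and headers can be passed to any HTTP client, for example `fetch(url, { headers: authHeaders(token) })`.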

⚙️ Configuration

Environment Variables

Create a .env file from the template:

cp .env.example .env

Variable          Default                   Description
----------------  ------------------------  ------------------------
INGEST_TOKEN      your-secret-token-here    API authentication token
CLICKHOUSE_HOST   clickhouse                ClickHouse hostname
CLICKHOUSE_PORT   9000                      ClickHouse native port
CLICKHOUSE_DB     default                   Database name
BUFFER_SIZE       1000                      Events before flush
FLUSH_INTERVAL    1                         Seconds between flushes
API_URL           http://server:8080/v1     Dashboard API URL
API_KEY           (same as INGEST_TOKEN)    Dashboard API key
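
BUFFER_SIZE and FLUSH_INTERVAL together describe a buffer that flushes when it is full or when enough time has passed, whichever comes first. The sketch below shows that behaviour in TypeScript for illustration only; the real buffer lives in the Go server, and the `EventBuffer` class, its method names, and the explicit clock arguments are assumptions.

```typescript
// Size-or-time flush buffer. The caller supplies timestamps (in milliseconds)
// so the behaviour is easy to reason about without real timers.
class EventBuffer<T> {
  private items: T[] = [];
  private lastFlushMs: number;
  private readonly maxSize: number;         // analogous to BUFFER_SIZE
  private readonly flushIntervalMs: number; // analogous to FLUSH_INTERVAL (in ms)
  private readonly onFlush: (batch: T[]) => void;

  constructor(maxSize: number, flushIntervalMs: number, onFlush: (batch: T[]) => void, nowMs: number) {
    this.maxSize = maxSize;
    this.flushIntervalMs = flushIntervalMs;
    this.onFlush = onFlush;
    this.lastFlushMs = nowMs;
  }

  // Adds one event and flushes immediately once the buffer is full.
  add(item: T, nowMs: number): void {
    this.items.push(item);
    if (this.items.length >= this.maxSize) this.drain(nowMs);
  }

  // Called periodically; flushes a non-empty buffer once the interval has elapsed.
  tick(nowMs: number): void {
    if (this.items.length > 0 && nowMs - this.lastFlushMs >= this.flushIntervalMs) {
      this.drain(nowMs);
    }
  }

  // Hands the current batch to the sink and starts a fresh one.
  private drain(nowMs: number): void {
    const batch = this.items;
    this.items = [];
    this.lastFlushMs = nowMs;
    this.onFlush(batch);
  }
}
```

With the defaults above this would correspond to `new EventBuffer(1000, 1000, insertBatch, Date.now())`, where `insertBatch` is a hypothetical function that writes one batch to ClickHouse. A larger buffer means fewer, bigger inserts; a shorter interval bounds how stale the dashboard can be under light traffic.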

Generate Secure Token

openssl rand -base64 32

🐳 Docker Deployment

Local Development

# Start core services (server, dashboard, clickhouse)
docker compose up -d

# View status
docker compose ps

# View logs
docker compose logs -f server dashboard

Production (with Traefik)

# Start all services including Traefik reverse proxy
docker compose --profile production up -d

Production URLs (configure DNS first):

  • Dashboard: https://lumen.example.com
  • API: https://api.lumen.example.com

Stop Services

docker compose down

# Remove volumes too
docker compose down -v

🗄️ Database Schema

ai_traffic_events (ClickHouse)

CREATE TABLE ai_traffic_events (
  project_id String,             -- Assumed column and type: referenced by ORDER BY but absent from the original listing
  ts DateTime64(3),              -- Event timestamp
  received_at DateTime64(3),     -- Server received time
  request_id UUID,               -- Unique request ID
  method String,                 -- HTTP method
  pathname String,               -- URL path
  route String,                  -- Normalized route
  ip String,                     -- Client IP
  user_agent String,             -- Raw User-Agent

  -- AI Classification
  is_ai UInt8,                   -- 1 if AI bot, 0 otherwise
  ai_vendor String,              -- "openai", "anthropic", etc.
  bot_name String,               -- "GPTBot", "ClaudeBot", etc.
  intent String,                 -- "training", "search", etc.
  confidence String              -- Classification confidence
)
ENGINE = MergeTree
PARTITION BY toDate(ts)
ORDER BY (project_id, route, ts);

🧪 Development

Dashboard

cd app/lumen-dashboard
pnpm install
pnpm dev

Server

cd app/lumen-server
go run ./cmd/server

SDK

cd packages/lumen-sdk
pnpm install
pnpm build

📈 Revenue Estimation

Lumen helps estimate potential revenue from AI traffic monetization:

  1. Set Route Prices: Define a price per 1,000 requests ($/1K) for each route
  2. View Estimates: See projected revenue at different pay-through rates
  3. Track Over Time: Monitor AI traffic trends

Example: If /api/products receives 100K AI requests/month at $5/1K:

  • Low estimate (10% pay-through): $50/month
  • Mid estimate (50% pay-through): $250/month
  • High estimate (100% pay-through): $500/month
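
The arithmetic behind the example above is requests / 1,000 × price per 1K × pay-through rate. A small sketch, assuming a hypothetical `estimateRevenue` helper (not part of the Lumen API) that rounds to whole cents:

```typescript
// Estimated monthly revenue in dollars.
// requests:    AI requests per month for the route
// pricePer1k:  price in dollars per 1,000 requests
// payThrough:  fraction of requests that end up paid, between 0 and 1
function estimateRevenue(requests: number, pricePer1k: number, payThrough: number): number {
  const gross = (requests / 1000) * pricePer1k;   // revenue if every request paid
  return Math.round(gross * payThrough * 100) / 100; // round to cents
}

// The three scenarios from the example: 100K requests/month at $5 per 1K.
const lowEstimate = estimateRevenue(100000, 5, 0.1);  // 10% pay-through
const midEstimate = estimateRevenue(100000, 5, 0.5);  // 50% pay-through
const highEstimate = estimateRevenue(100000, 5, 1.0); // 100% pay-through
```

Here the gross figure is 100 × $5 = $500, so the three estimates come out to $50, $250, and $500 per month, matching the list above.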

🀝 Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit changes (git commit -m 'Add amazing feature')
  4. Push to branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

🙏 Acknowledgments

  • ClickHouse: Lightning-fast analytics database
  • Next.js: React framework
  • shadcn/ui: Beautiful UI components
  • Recharts: Composable charting library
  • Chi: Lightweight Go router

Built with ❤️ for the AI era
