LLM Cost and Latency Optimization Dashboard
An LLM usage analytics dashboard for cost, latency, routing, and optimization decisions.
CostPilot is a dashboard and middleware toolkit that tracks LLM usage, cost, latency, cache hit rate, model selection, and workflow-level spend. It serves as the financial control panel for AI systems.
- Real-time Cost Tracking: Monitor token usage and costs across all LLM providers
- Latency Analytics: Track response times with p50, p95, p99 percentiles
- Workflow Attribution: Attribute costs to specific workflows and features
- Cache Hit Rate: Measure prompt caching effectiveness and savings
- Model Comparison: Compare cost and performance across models
- Expensive Prompt Detection: Identify costly prompts for optimization
- Budget Alerts: Set spending thresholds and get notified
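The latency percentiles mentioned above can be sketched with a simple nearest-rank computation. This is a minimal illustration: the `percentile` helper and the sample records are hypothetical, not part of the SDK.

```python
# Minimal sketch of p50/p95/p99 latency percentiles over logged records.
# The "latency_ms" field mirrors the SDK's usage payload; everything else
# here is illustrative.

def percentile(sorted_values, p):
    """Nearest-rank percentile over a pre-sorted list of samples."""
    if not sorted_values:
        raise ValueError("no samples")
    rank = round(p / 100 * len(sorted_values))  # nearest rank, 1-based
    index = max(0, min(len(sorted_values) - 1, rank - 1))
    return sorted_values[index]

records = [{"latency_ms": v}
           for v in (800, 950, 1200, 1500, 3100, 900, 1100, 1250, 4000, 1000)]
latencies = sorted(r["latency_ms"] for r in records)

p50 = percentile(latencies, 50)  # median latency
p95 = percentile(latencies, 95)  # tail latency
p99 = percentile(latencies, 99)
```

A production dashboard would typically push this into the database (e.g. SQL `percentile_cont`) rather than computing it in application code.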
```
┌─────────────┐      ┌──────────────┐      ┌────────────┐
│   SDK/MW    │─────▶│   FastAPI    │─────▶│ PostgreSQL │
│  (Python)   │      │    Server    │      │            │
└─────────────┘      └──────┬───────┘      └────────────┘
                            │
                     ┌──────▼───────┐
                     │   Next.js    │
                     │  Dashboard   │
                     └──────────────┘
```
```
cp .env.example .env
docker-compose up -d
```

The dashboard will be available at http://localhost:3000 and the API at http://localhost:8000.
```
# Server
cd server
pip install -r requirements.txt
uvicorn app.main:app --reload
```

```
# Dashboard
cd dashboard
npm install
npm run dev
```

```
# SDK (editable install)
cd sdk
pip install -e .
```

```python
from costpilot import CostPilotClient

client = CostPilotClient(
    server_url="http://localhost:8000",
    api_key="your-api-key",
    project="my-project"
)

await client.log_usage({
    "model": "gpt-4o",
    "workflow": "summarization",
    "prompt_tokens": 1500,
    "completion_tokens": 500,
    "latency_ms": 1200,
    "cached": False
})
```

```python
from costpilot.decorators import track_cost, track_llm_call

@track_cost(workflow="content-generation")
@track_llm_call(model="gpt-4o")
async def generate_content(prompt: str) -> str:
    response = await openai.chat.completions.create(...)
    return response.choices[0].message.content
```

```python
from costpilot.middleware import CostPilotMiddleware

app.add_middleware(CostPilotMiddleware, client=client)
```

CostPilot calculates costs based on provider pricing data:
- Input tokens: Charged per 1K tokens at the input rate
- Output tokens: Charged per 1K tokens at the output rate
- Cache savings: Cached tokens are tracked separately for savings calculation
- Workflow aggregation: Costs roll up to workflow and project levels
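The rules above amount to a simple per-1K-token calculation. The sketch below illustrates it; the rates and the cached-savings handling are illustrative assumptions, not CostPilot's shipped pricing or logic.

```python
# Sketch of the per-1K-token cost rules above.
# Rates are illustrative placeholders, not real provider pricing.
PRICING = {
    "gpt-4o": {"input_per_1k": 0.0025, "output_per_1k": 0.01},
}

def record_cost(record):
    """Return (billed_cost, cache_savings) in USD for one usage record."""
    rates = PRICING[record["model"]]
    input_cost = record["prompt_tokens"] / 1000 * rates["input_per_1k"]
    output_cost = record["completion_tokens"] / 1000 * rates["output_per_1k"]
    cost = input_cost + output_cost
    # Assumption: a cached call bills nothing and its would-be cost
    # is tracked separately as savings.
    if record.get("cached"):
        return 0.0, cost
    return cost, 0.0

cost, savings = record_cost({
    "model": "gpt-4o",
    "prompt_tokens": 1500,
    "completion_tokens": 500,
    "cached": False,
})
```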
Pricing data is loaded from YAML configuration files. See config/pricing.example.yaml for the format.
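A pricing file might look something like the following. This is a hypothetical sketch of one plausible shape; the authoritative schema is in config/pricing.example.yaml, and the rates shown are placeholders.

```yaml
# Hypothetical pricing file; consult config/pricing.example.yaml for the
# real schema. Rates below are illustrative, not current provider pricing.
openai:
  gpt-4o:
    input_per_1k_usd: 0.0025
    output_per_1k_usd: 0.01
  gpt-4o-mini:
    input_per_1k_usd: 0.00015
    output_per_1k_usd: 0.0006
anthropic:
  claude-3-5-sonnet:
    input_per_1k_usd: 0.003
    output_per_1k_usd: 0.015
```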
Supported providers:
- OpenAI (GPT-4o, GPT-4o-mini, GPT-4 Turbo, GPT-3.5 Turbo)
- Anthropic (Claude 3.5 Sonnet, Claude 3 Opus, Claude 3 Haiku)
- Google (Gemini 1.5 Pro, Gemini 1.5 Flash)
- Mistral (Mistral Large, Mistral Medium)
| Endpoint | Method | Description |
|---|---|---|
| /api/v1/usage | POST | Log a usage record |
| /api/v1/usage/batch | POST | Log batch usage records |
| /api/v1/costs | GET | Query cost data |
| /api/v1/costs/by-workflow | GET | Costs grouped by workflow |
| /api/v1/costs/by-model | GET | Costs grouped by model |
| /api/v1/analytics/over-time | GET | Cost trends over time |
| /api/v1/analytics/expensive-prompts | GET | Most expensive prompts |
| /api/v1/analytics/optimization | GET | Optimization suggestions |
| /api/v1/health | GET | Health check |
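As an illustration of the usage endpoint above, a record could be posted with the standard library. The Bearer auth header is an assumption about the deployment, and the request is prepared but deliberately not sent here.

```python
import json
import urllib.request

# Hypothetical call to POST /api/v1/usage; the Authorization scheme
# is an assumption, not a documented requirement.
payload = {
    "model": "gpt-4o",
    "workflow": "summarization",
    "prompt_tokens": 1500,
    "completion_tokens": 500,
    "latency_ms": 1200,
    "cached": False,
}

req = urllib.request.Request(
    "http://localhost:8000/api/v1/usage",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json",
             "Authorization": "Bearer your-api-key"},
    method="POST",
)
# urllib.request.urlopen(req) would perform the request; omitted here.
```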
- Home: Summary overview with key metrics and charts
- Costs: Detailed cost breakdown by model, workflow, and time
- Latency: Latency distribution and trends per model
- Workflows: Workflow-level spend and performance metrics
```
# SDK tests
cd sdk && pytest

# Server tests
cd server && pytest

# Dashboard build
cd dashboard && npm run build
```

See .env.example for all configuration options.
Budget thresholds can include webhook_url, warning_threshold_percent, and critical_threshold_percent. CostPilot records the last crossed threshold and exposes the alert payload from /api/v1/budget/status/{project}:

```json
{
  "project": "my-project",
  "monthly_budget_usd": 1000.0,
  "current_spend_usd": 850.0,
  "percent_used": 85.0,
  "status": "warning",
  "threshold_percent": 80,
  "triggered_at": "2026-05-08T22:00:00Z"
}
```

Outbound webhook delivery requires an approved sender integration; the API stores the webhook configuration and exposes the exact alert payload through the status endpoint.
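The status field in the payload above can be derived from spend and thresholds. A minimal sketch follows; the 95% critical default is an assumption (only the 80% warning figure appears in the example payload), and the function name is illustrative.

```python
# Sketch: deriving the alert status from spend and thresholds,
# mirroring the /api/v1/budget/status payload fields above.
# The 95% critical default is an assumption, not a documented value.
def budget_status(current_spend_usd, monthly_budget_usd,
                  warning_threshold_percent=80,
                  critical_threshold_percent=95):
    percent_used = round(current_spend_usd / monthly_budget_usd * 100, 1)
    if percent_used >= critical_threshold_percent:
        status = "critical"
    elif percent_used >= warning_threshold_percent:
        status = "warning"
    else:
        status = "ok"
    return {"percent_used": percent_used, "status": status}
```

With the numbers from the example payload, `budget_status(850.0, 1000.0)` yields 85.0% used and a "warning" status.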
MIT