239 changes: 239 additions & 0 deletions LANGSMITH_SETUP.md
@@ -0,0 +1,239 @@
# LangSmith Observability Setup

This document explains how to set up and use LangSmith for monitoring LLM and Replicate API calls in this project.

## Overview

LangSmith provides observability for:
- **OpenAI API calls** (GPT-4o, GPT-4o-mini) - scene generation, semantic augmentation, prompt parsing
- **Anthropic API calls** (Claude) - prompt parsing, creative direction
- **XAI API calls** (Grok-4) - AI-powered image pair selection, property scene analysis
- **Replicate API calls** - image generation (Flux), video generation (SkyReels, Veo3, Hailuo), audio generation
- **HTTP requests** - all FastAPI endpoints in backend and promptparser services

## Quick Start

### 1. Sign Up for LangSmith Cloud

1. Go to [smith.langchain.com](https://smith.langchain.com)
2. Sign up for a free account
3. Create a new project (e.g., "video-sim-poc")
4. Generate an API key from Settings → API Keys

### 2. Configure Environment Variables

Add the following to your `.env` file or environment:

```bash
# Enable LangSmith tracing
LANGCHAIN_TRACING_V2=true

# Your LangSmith API key (required)
LANGCHAIN_API_KEY=<your-api-key-here>

# Project name (optional, defaults to "video-sim-poc")
LANGCHAIN_PROJECT=video-sim-poc

# LangSmith API endpoint (optional, defaults to cloud)
LANGCHAIN_ENDPOINT=https://api.smith.langchain.com
```

For the **promptparser** service, also add these to `promptparser/.env`:

```bash
LANGCHAIN_TRACING_V2=true
LANGCHAIN_API_KEY=<your-api-key-here>
LANGCHAIN_PROJECT=video-sim-poc
LANGCHAIN_ENDPOINT=https://api.smith.langchain.com
```

### 3. Install Dependencies

Dependencies are already added to `pyproject.toml`. Install them with:

```bash
# If using uv (recommended)
uv pip install -e .

# Or with pip
pip install -e .
```

### 4. Run the Application

Start your application as usual:

```bash
# Docker Compose
docker-compose up

# Or locally
uvicorn backend.main:app --reload
```

### 5. View Traces

1. Go to [smith.langchain.com](https://smith.langchain.com)
2. Navigate to your project
3. Traces for the instrumented API calls appear automatically as requests are processed

## What Gets Traced

### LLM Calls

All OpenAI, Anthropic, and XAI calls made through the instrumented functions below are traced with:
- **Input prompts** and system messages
- **Model** and parameters (temperature, max_tokens, etc.)
- **Responses** and token usage
- **Latency** and timing
- **Errors** with full stack traces

**Instrumented functions:**
- `backend/services/scene_generator.py`: `generate_scenes()`, `regenerate_scene()`
- `backend/llm_interpreter.py`: `augment_object()`, `augment_scene()`
- `backend/services/xai_client.py`: `select_image_pairs()`, `select_property_scene_pairs()`
- `promptparser/app/services/llm/openai_provider.py`: `complete()`, `analyze_image()`
- `promptparser/app/services/llm/claude_provider.py`: `complete()`, `analyze_image()`

### Replicate API Calls

All Replicate API calls are traced with:
- **Model** and version
- **Input parameters** (prompt, images, duration, etc.)
- **Prediction ID** and status
- **Polling** behavior and timing
- **Output URLs** and results
- **Errors** and failures

**Instrumented functions:**
- `backend/services/replicate_client.py`: `generate_image()`, `generate_video()`, `generate_video_from_pair()`, `poll_prediction()`

### HTTP Requests

All HTTP requests to your FastAPI endpoints are traced with:
- **Method** and path (e.g., `POST /api/scenes`)
- **Query parameters**
- **Nested LLM/Replicate calls** automatically grouped under the parent request
- **Response time** and status

**Middleware:**
- `backend/main.py`: `LangSmithMiddleware`
- `promptparser/app/main.py`: `LangSmithMiddleware`
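
The middleware wraps a request in a trace only when tracing is enabled and the path is not a health check, docs page, or static asset. A minimal sketch of that decision as a standalone function follows. The names `should_trace_request`, `SKIPPED_PATHS`, and `STATIC_PREFIX` are hypothetical; the path list mirrors the condition in `backend/main.py`.

```python
# Exact paths the backend middleware leaves untraced (mirrors backend/main.py).
SKIPPED_PATHS = frozenset({"/health", "/docs", "/openapi.json", "/redoc"})
# Static files are served under this prefix and are also skipped.
STATIC_PREFIX = "/assets"


def should_trace_request(path: str, tracing_enabled: bool) -> bool:
    """Return True when a request should be wrapped in a LangSmith trace."""
    if not tracing_enabled:
        return False
    if path in SKIPPED_PATHS or path.startswith(STATIC_PREFIX):
        return False
    return True
```

Pulling the rule into a function like this would make it unit-testable without starting the FastAPI app.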

## Viewing Traces

### Trace Hierarchy

Traces are organized hierarchically:

```
POST /api/scenes (HTTP request)
├─ generate_scenes (scene generation)
│  ├─ openai_generate_scenes (OpenAI API call)
│  │  ├─ Input: prompt, model, temperature
│  │  └─ Output: scenes JSON, tokens used
│  └─ replicate_generate_image (Replicate API call)
│     ├─ Input: prompt, model
│     └─ Output: image URL, prediction ID
└─ Response: 200 OK
```

### Filtering and Searching

Use LangSmith's UI to:
- **Filter by tags**: `openai`, `anthropic`, `xai`, `grok`, `replicate`, `http_request`, `scene_generation`, `image_selection`, etc.
- **Search by prompt**: Find specific scenes or prompts
- **Filter by status**: Find errors or slow requests
- **View costs**: Track spending per model
- **Compare runs**: See how prompts changed over time

## Disabling Tracing

To temporarily disable tracing:

```bash
# Set to false or remove the variable
LANGCHAIN_TRACING_V2=false
```

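Removing the `LANGCHAIN_API_KEY` environment variable also stops traces from being uploaded, but the flag above is the cleaner switch because the middleware checks only `LANGCHAIN_TRACING_V2`.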

## Docker Compose

The `docker-compose.yml` already includes all necessary environment variables with defaults:

```yaml
environment:
- LANGCHAIN_TRACING_V2=${LANGCHAIN_TRACING_V2:-false}
- LANGCHAIN_API_KEY=${LANGCHAIN_API_KEY}
- LANGCHAIN_PROJECT=${LANGCHAIN_PROJECT:-video-sim-poc}
- LANGCHAIN_ENDPOINT=${LANGCHAIN_ENDPOINT:-https://api.smith.langchain.com}
```

Just set `LANGCHAIN_TRACING_V2=true` and `LANGCHAIN_API_KEY=<key>` in your `.env` file.

## Troubleshooting

### No traces appearing

1. **Check environment variables**: Ensure `LANGCHAIN_TRACING_V2` is set to `true` and `LANGCHAIN_API_KEY` is set
2. **Check API key**: Verify the API key is valid in LangSmith settings
3. **Check project name**: Ensure the project exists in LangSmith
4. **Check logs**: Look for any LangSmith-related errors in application logs
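
The first checks above can be automated with a small helper. This is a sketch under assumptions: `langsmith_env_problems` is a hypothetical name, and it treats only the literal value `true` as enabling tracing, matching the value used throughout this guide.

```python
from typing import List, Mapping


def langsmith_env_problems(env: Mapping[str, str]) -> List[str]:
    """Return human-readable problems with the LangSmith variables in `env`."""
    problems = []
    # Assumption: tracing is considered enabled only for the value "true".
    if env.get("LANGCHAIN_TRACING_V2", "").strip().lower() != "true":
        problems.append("LANGCHAIN_TRACING_V2 is not set to 'true'")
    if not env.get("LANGCHAIN_API_KEY", "").strip():
        problems.append("LANGCHAIN_API_KEY is missing or empty")
    if not env.get("LANGCHAIN_PROJECT", "").strip():
        problems.append("LANGCHAIN_PROJECT is unset; the default project name will be used")
    return problems
```

Calling `langsmith_env_problems(os.environ)` at startup and logging each returned string is one way to surface misconfiguration early.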

### Traces are incomplete

- Ensure all services are using the same `LANGCHAIN_PROJECT` name
- Verify middleware is registered (check `backend/main.py` and `promptparser/app/main.py`)

### Performance concerns

- Tracing typically adds little overhead (~5-20ms per trace)
- For high-throughput production, consider:
  - Using sampling (trace only a percentage of requests)
  - Disabling tracing for health checks (already implemented in the middleware)
  - Using LangSmith's batch mode
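
Sampling can be as simple as a random draw per request. The sketch below assumes a hypothetical helper `sampled` and a hypothetical `sample_rate` setting; neither exists in this project yet.

```python
import random
from typing import Optional


def sampled(sample_rate: float, rng: Optional[random.Random] = None) -> bool:
    """Return True for roughly `sample_rate` of calls (0.0 = never, 1.0 = always)."""
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0.0 and 1.0")
    # Allow a seeded generator to be injected so the behaviour is testable.
    source = rng if rng is not None else random
    # random() returns a float in [0.0, 1.0), so 1.0 always passes and 0.0 never does.
    return source.random() < sample_rate
```

In `LangSmithMiddleware.dispatch`, a check such as `sampled(0.1)` could sit next to the existing skip condition so that only about one request in ten is traced.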

## Advanced Configuration

### Custom Tags

Add custom tags to traces for better organization:

```python
from langsmith import traceable

@traceable(name="my_function", tags=["custom_tag", "production"])
def my_function():
    # Your code here
    ...
```

### Metadata

Add metadata to provide additional context:

```python
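from langsmith import traceable


# Decorator arguments are evaluated once, at definition time,
# so only static values belong here.
@traceable(name="process_user_request", metadata={"version": "v2"})
def process_user_request(user_id, campaign_id):
    # Your code here
    ...


# Per-request values are attached at call time via `langsmith_extra`.
process_user_request(
    "user-123",
    "campaign-456",
    langsmith_extra={"metadata": {"user_id": "user-123", "campaign_id": "campaign-456"}},
)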
```

## Cost Tracking

LangSmith automatically tracks costs for:
- **OpenAI** models (based on token usage and pricing)
- **Anthropic** models (based on token usage and pricing)
- **Replicate** models (if pricing info is available)

View costs in the LangSmith dashboard under your project.
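
Token-based cost is a simple weighted sum, which is useful to know when sanity-checking the dashboard figures. The sketch below uses a hypothetical function `estimate_llm_cost`, and the prices are placeholders rather than real pricing for any model.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_llm_cost(
    prompt_tokens: int,
    completion_tokens: int,
    input_price_per_million: str,
    output_price_per_million: str,
) -> Decimal:
    """Estimate one call's cost in USD from token counts and per-million-token prices."""
    input_cost = Decimal(prompt_tokens) * Decimal(input_price_per_million)
    output_cost = Decimal(completion_tokens) * Decimal(output_price_per_million)
    return (input_cost + output_cost) / TOKENS_PER_MILLION


# Placeholder prices, NOT real pricing: 1200 input tokens and 300 output tokens.
cost = estimate_llm_cost(1200, 300, "2.00", "8.00")
```

`Decimal` is used instead of `float` so that small per-call amounts add up without rounding drift.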

## Further Reading

- [LangSmith Documentation](https://docs.smith.langchain.com/)
- [LangSmith Python SDK](https://github.com/langchain-ai/langsmith-sdk)
- [Tracing Reference](https://docs.smith.langchain.com/tracing)
6 changes: 6 additions & 0 deletions backend/config.py
@@ -22,6 +22,12 @@ class Settings(BaseSettings):
    OPENROUTER_API_KEY: Optional[str] = None
    XAI_API_KEY: Optional[str] = None  # For Grok models

    # LangSmith observability settings
    LANGCHAIN_TRACING_V2: bool = False  # Enable LangSmith tracing
    LANGCHAIN_API_KEY: Optional[str] = None  # LangSmith API key
    LANGCHAIN_PROJECT: str = "video-sim-poc"  # LangSmith project name
    LANGCHAIN_ENDPOINT: str = "https://api.smith.langchain.com"  # LangSmith API endpoint

    # Storage settings
    VIDEO_STORAGE_PATH: str = "./DATA/videos"

3 changes: 3 additions & 0 deletions backend/llm_interpreter.py
@@ -10,6 +10,7 @@
from typing import Dict, List, Optional
from openai import OpenAI
from pydantic import BaseModel
from langsmith import traceable


class GenesisProperties(BaseModel):
@@ -41,6 +42,7 @@ def __init__(self):
        self.client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
        self.model = "gpt-4o"  # or gpt-4-turbo, gpt-4, gpt-3.5-turbo

    @traceable(name="llm_augment_object", tags=["openai", "genesis", "semantic_augmentation"])
    async def augment_object(
        self,
        shape: str,
@@ -199,6 +201,7 @@ def _parse_llm_response(self, response: str) -> GenesisProperties:
                reasoning=f"Failed to parse LLM response: {e}"
            )

    @traceable(name="llm_augment_scene", tags=["openai", "genesis", "scene_augmentation"])
    async def augment_scene(
        self,
        scene_objects,
34 changes: 34 additions & 0 deletions backend/main.py
@@ -26,6 +26,8 @@
import asyncio
from dotenv import load_dotenv
from pathlib import Path
from langsmith import traceable
from starlette.middleware.base import BaseHTTPMiddleware

# Import Asset Pydantic models
from .schemas.assets import (
@@ -335,6 +337,38 @@ def validate_file_type_with_magic_bytes(
)


# LangSmith tracing middleware
class LangSmithMiddleware(BaseHTTPMiddleware):
    """Middleware to trace HTTP requests with LangSmith"""

    async def dispatch(self, request: Request, call_next):
        # Get settings to check if tracing is enabled
        settings = get_settings()

        # Skip tracing if disabled or for static files/health checks
        if (
            not settings.LANGCHAIN_TRACING_V2
            or request.url.path in ["/health", "/docs", "/openapi.json", "/redoc"]
            or request.url.path.startswith("/assets")
        ):
            return await call_next(request)

        # Create traced function for the request
        @traceable(
            name=f"{request.method} {request.url.path}",
            tags=["http_request", "backend", request.method.lower()],
            metadata={
                "method": request.method,
                "path": request.url.path,
                "query_params": dict(request.query_params),
            },
        )
        async def process_request():
            response = await call_next(request)
            return response

        return await process_request()


app.add_middleware(LangSmithMiddleware)


# Add rate limiting
@app.exception_handler(RateLimitExceeded)
async def rate_limit_handler(request, exc):
5 changes: 5 additions & 0 deletions backend/services/replicate_client.py
@@ -10,6 +10,7 @@
from os import environ
from typing import Dict, List, Optional, Any
import requests
from langsmith import traceable

# Configure logging
logger = logging.getLogger(__name__)
@@ -84,6 +85,7 @@ def __init__(self, api_key: Optional[str] = None):

        logger.info("ReplicateClient initialized successfully")

    @traceable(name="replicate_generate_image", tags=["replicate", "image_generation", "flux"])
    def generate_image(
        self,
        prompt: str,
@@ -187,6 +189,7 @@ def generate_image(
                "prediction_id": None
            }

    @traceable(name="replicate_generate_video", tags=["replicate", "video_generation", "skyreels"])
    def generate_video(
        self,
        image_urls: List[str],
@@ -313,6 +316,7 @@ def generate_video(
                "duration_seconds": 0
            }

    @traceable(name="replicate_poll_prediction", tags=["replicate", "polling", "status"])
    def poll_prediction(
        self,
        prediction_id: str,
@@ -483,6 +487,7 @@ def estimate_cost(self, num_images: int, video_duration: int) -> float:

        return total_cost

    @traceable(name="replicate_generate_video_from_pair", tags=["replicate", "video_generation", "image_to_video"])
    def generate_video_from_pair(
        self,
        image1_url: str,