pyrex41 · pyrex41 · Nov 22, 2025
diff --git a/LANGSMITH_SETUP.md b/LANGSMITH_SETUP.md
@@ -0,0 +1,239 @@
+# LangSmith Observability Setup
+
+This document explains how to set up and use LangSmith for monitoring LLM and Replicate API calls in this project.
+
+## Overview
+
+LangSmith provides observability for:
+- **OpenAI API calls** (GPT-4o, GPT-4o-mini) - scene generation, semantic augmentation, prompt parsing
+- **Anthropic API calls** (Claude) - prompt parsing, creative direction
+- **xAI API calls** (Grok-4) - AI-powered image pair selection, property scene analysis
+- **Replicate API calls** - image generation (Flux), video generation (SkyReels, Veo3, Hailuo), audio generation
+- **HTTP requests** - all FastAPI endpoints in backend and promptparser services
+
+## Quick Start
+
+### 1. Sign Up for LangSmith Cloud
+
+1. Go to [smith.langchain.com](https://smith.langchain.com)
+2. Sign up for a free account
+3. Create a new project (e.g., "video-sim-poc")
+4. Generate an API key from Settings → API Keys
+
+### 2. Configure Environment Variables
+
+Add the following to your `.env` file or environment:
+
+```bash
+# Enable LangSmith tracing
+LANGCHAIN_TRACING_V2=true
+
+# Your LangSmith API key (required)
+LANGCHAIN_API_KEY=<your-api-key-here>
+
+# Project name (optional, defaults to "video-sim-poc")
+LANGCHAIN_PROJECT=video-sim-poc
+
+# LangSmith API endpoint (optional, defaults to cloud)
+LANGCHAIN_ENDPOINT=https://api.smith.langchain.com
+```
+
+For **promptparser** service, also add these to `promptparser/.env`:
+
+```bash
+LANGCHAIN_TRACING_V2=true
+LANGCHAIN_API_KEY=<your-api-key-here>
+LANGCHAIN_PROJECT=video-sim-poc
+LANGCHAIN_ENDPOINT=https://api.smith.langchain.com
+```
+
+### 3. Install Dependencies
+
+Dependencies are already added to `pyproject.toml`. Install them with:
+
+```bash
+# If using uv (recommended)
+uv pip install -e .
+
+# Or with pip
+pip install -e .
+```
+
+### 4. Run the Application
+
+Start your application as usual:
+
+```bash
+# Docker Compose
+docker-compose up
+
+# Or locally
+uvicorn backend.main:app --reload
+```
+
+### 5. View Traces
+
+1. Go to [smith.langchain.com](https://smith.langchain.com)
+2. Navigate to your project
+3. You'll see traces for all API calls automatically
+
+## What Gets Traced
+
+### LLM Calls
+
+OpenAI, Anthropic, and xAI calls made through the functions listed below are traced with:
+- **Input prompts** and system messages
+- **Model** and parameters (temperature, max_tokens, etc.)
+- **Responses** and token usage
+- **Latency** and timing
+- **Errors** with full stack traces
+
+**Instrumented functions:**
+- `backend/services/scene_generator.py`: `generate_scenes()`, `regenerate_scene()`
+- `backend/llm_interpreter.py`: `augment_object()`, `augment_scene()`
+- `backend/services/xai_client.py`: `select_image_pairs()`, `select_property_scene_pairs()`
+- `promptparser/app/services/llm/openai_provider.py`: `complete()`, `analyze_image()`
+- `promptparser/app/services/llm/claude_provider.py`: `complete()`, `analyze_image()`
+
+### Replicate API Calls
+
+All Replicate API calls are traced with:
+- **Model** and version
+- **Input parameters** (prompt, images, duration, etc.)
+- **Prediction ID** and status
+- **Polling** behavior and timing
+- **Output URLs** and results
+- **Errors** and failures
+
+**Instrumented functions:**
+- `backend/services/replicate_client.py`: `generate_image()`, `generate_video()`, `generate_video_from_pair()`, `poll_prediction()`
+
+### HTTP Requests
+
+All HTTP requests to your FastAPI endpoints are traced with:
+- **Method** and path (e.g., `POST /api/scenes`)
+- **Query parameters**
+- **Nested LLM/Replicate calls** automatically grouped under the parent request
+- **Response time** and status
+
+**Middleware:**
+- `backend/main.py`: `LangSmithMiddleware`
+- `promptparser/app/main.py`: `LangSmithMiddleware`
+
+## Viewing Traces
+
+### Trace Hierarchy
+
+Traces are organized hierarchically:
+
+```
+POST /api/scenes (HTTP request)
+├─ generate_scenes (scene generation)
+│  ├─ openai_generate_scenes (OpenAI API call)
+│  │  └─ Input: prompt, model, temperature
+│  │  └─ Output: scenes JSON, tokens used
+│  └─ replicate_generate_image (Replicate API call)
+│     └─ Input: prompt, model
+│     └─ Output: image URL, prediction ID
+└─ Response: 200 OK
+```
+
+### Filtering and Searching
+
+Use LangSmith's UI to:
+- **Filter by tags**: `openai`, `anthropic`, `xai`, `grok`, `replicate`, `http_request`, `scene_generation`, `image_selection`, etc.
+- **Search by prompt**: Find specific scenes or prompts
+- **Filter by status**: Find errors or slow requests
+- **View costs**: Track spending per model
+- **Compare runs**: See how prompts changed over time
+
+## Disabling Tracing
+
+To temporarily disable tracing:
+
+```bash
+# Set to false or remove the variable
+LANGCHAIN_TRACING_V2=false
+```
+
+Or remove the `LANGCHAIN_API_KEY` environment variable.
+
+## Docker Compose
+
+The `docker-compose.yml` already includes all necessary environment variables with defaults:
+
+```yaml
+environment:
+  - LANGCHAIN_TRACING_V2=${LANGCHAIN_TRACING_V2:-false}
+  - LANGCHAIN_API_KEY=${LANGCHAIN_API_KEY}
+  - LANGCHAIN_PROJECT=${LANGCHAIN_PROJECT:-video-sim-poc}
+  - LANGCHAIN_ENDPOINT=${LANGCHAIN_ENDPOINT:-https://api.smith.langchain.com}
+```
+
+Just set `LANGCHAIN_TRACING_V2=true` and `LANGCHAIN_API_KEY=<key>` in your `.env` file.
+
+## Troubleshooting
+
+### No traces appearing
+
+1. **Check environment variables**: Ensure `LANGCHAIN_TRACING_V2` is set to `true` and `LANGCHAIN_API_KEY` is set to a non-empty value
+2. **Check API key**: Verify the API key is valid in LangSmith settings
+3. **Check project name**: Ensure the project exists in LangSmith
+4. **Check logs**: Look for any LangSmith-related errors in application logs
+
+### Traces are incomplete
+
+- Ensure all services are using the same `LANGCHAIN_PROJECT` name
+- Verify middleware is registered (check `backend/main.py` and `promptparser/app/main.py`)
+
+### Performance concerns
+
+- Tracing typically adds little overhead (roughly 5-20 ms per trace)
+- For high-throughput production, consider:
+  - Using sampling (tracing only a percentage of requests)
+  - Disabling tracing for health checks (already implemented)
+  - Using LangSmith's batch mode
+
+## Advanced Configuration
+
+### Custom Tags
+
+Add custom tags to traces for better organization (`traceable` is imported from `langsmith`):
+
+```python
+@traceable(name="my_function", tags=["custom_tag", "production"])
+def my_function():
+    ...  # Your code here
+```
+
+### Metadata
+
+Add metadata to provide additional context:
+
+```python
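+from langsmith import traceable
+
+# Decorator arguments are evaluated once, when the function is defined, so keep them static
+@traceable(name="process_user_request", metadata={"version": "v2"})
+def process_user_request(user_id, campaign_id):
+    ...  # Your code here
+
+# Per-request values are passed at call time via langsmith_extra
+extra = {"metadata": {"user_id": "u-123", "campaign_id": "c-456"}}
+process_user_request("u-123", "c-456", langsmith_extra=extra)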
+```
+
+## Cost Tracking
+
+LangSmith automatically tracks costs for:
+- **OpenAI** models (based on token usage and pricing)
+- **Anthropic** models (based on token usage and pricing)
+- **Replicate** models (if pricing info is available)
+
+View costs in the LangSmith dashboard under your project.
+
+## Further Reading
+
+- [LangSmith Documentation](https://docs.smith.langchain.com/)
+- [LangSmith Python SDK](https://github.com/langchain-ai/langsmith-sdk)
+- [Tracing Reference](https://docs.smith.langchain.com/tracing)
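Note on LANGSMITH_SETUP.md: the "No traces appearing" checklist could be turned into a quick preflight check. The sketch below uses only the standard library and the variable names from the setup guide. The function name `check_langsmith_env` and the set of accepted truthy spellings are assumptions made for illustration; they are not part of this PR or of the LangSmith SDK.

```python
import os
from typing import List, Mapping, Optional

# Assumption: spellings treated as "enabled"; the SDK's own parsing may differ.
TRUTHY = {"1", "true", "yes", "on"}


def check_langsmith_env(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return a list of problems; an empty list means the basics look fine."""
    env = os.environ if env is None else env
    problems: List[str] = []

    # Tracing must be switched on explicitly.
    tracing = env.get("LANGCHAIN_TRACING_V2", "").strip().lower()
    if tracing not in TRUTHY:
        problems.append("LANGCHAIN_TRACING_V2 is not enabled")

    # An API key is required for traces to be accepted.
    if not env.get("LANGCHAIN_API_KEY", "").strip():
        problems.append("LANGCHAIN_API_KEY is missing or empty")

    # The endpoint is optional; fall back to the cloud default from the guide.
    endpoint = env.get("LANGCHAIN_ENDPOINT", "https://api.smith.langchain.com")
    if not endpoint.startswith(("http://", "https://")):
        problems.append("LANGCHAIN_ENDPOINT is not an http(s) URL")

    return problems


if __name__ == "__main__":
    for problem in check_langsmith_env():
        print(f"[langsmith] {problem}")
```

This only checks local configuration. Whether the key is valid and the project exists still has to be confirmed in the LangSmith UI.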
diff --git a/backend/config.py b/backend/config.py
@@ -22,6 +22,12 @@ class Settings(BaseSettings):
     OPENROUTER_API_KEY: Optional[str] = None
     XAI_API_KEY: Optional[str] = None  # For Grok models
 
+    # LangSmith observability settings
+    LANGCHAIN_TRACING_V2: bool = False  # Enable LangSmith tracing
+    LANGCHAIN_API_KEY: Optional[str] = None  # LangSmith API key
+    LANGCHAIN_PROJECT: str = "video-sim-poc"  # LangSmith project name
+    LANGCHAIN_ENDPOINT: str = "https://api.smith.langchain.com"  # LangSmith API endpoint
+
     # Storage settings
     VIDEO_STORAGE_PATH: str = "./DATA/videos"
 

diff --git a/backend/llm_interpreter.py b/backend/llm_interpreter.py
@@ -10,6 +10,7 @@
 from typing import Dict, List, Optional
 from openai import OpenAI
 from pydantic import BaseModel
+from langsmith import traceable
 
 
 class GenesisProperties(BaseModel):
@@ -41,6 +42,7 @@ def __init__(self):
         self.client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
         self.model = "gpt-4o"  # or gpt-4-turbo, gpt-4, gpt-3.5-turbo
 
+    @traceable(name="llm_augment_object", tags=["openai", "genesis", "semantic_augmentation"])
     async def augment_object(
         self,
         shape: str,
@@ -199,6 +201,7 @@ def _parse_llm_response(self, response: str) -> GenesisProperties:
                 reasoning=f"Failed to parse LLM response: {e}"
             )
 
+    @traceable(name="llm_augment_scene", tags=["openai", "genesis", "scene_augmentation"])
     async def augment_scene(
         self,
         scene_objects,
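Note on the decorated coroutines: the trace hierarchy described in the setup guide depends on each traced call knowing which run is currently active. The toy below sketches one way that kind of nesting can work for async functions, using `contextvars` from the standard library. It is a stand-in written for illustration and is not how `langsmith.traceable` is implemented; `toy_traceable`, `ROOT_RUNS`, and the simplified function signatures are all hypothetical.

```python
import asyncio
import contextvars
import functools
from typing import Any, Callable, Dict, List, Optional

# Toy stand-in for illustration only; this is NOT langsmith's implementation.
_current_run = contextvars.ContextVar("current_run", default=None)
ROOT_RUNS: List[Dict[str, Any]] = []


def toy_traceable(name: str, tags: Optional[List[str]] = None) -> Callable:
    """Record each call as a run nested under whichever run is currently active."""

    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        async def wrapper(*args: Any, **kwargs: Any) -> Any:
            run = {"name": name, "tags": list(tags or []), "children": [], "error": None}
            parent = _current_run.get()
            # Attach to the active parent run, or start a new root run.
            (parent["children"] if parent is not None else ROOT_RUNS).append(run)
            token = _current_run.set(run)  # calls made inside func see this run as parent
            try:
                return await func(*args, **kwargs)
            except Exception as exc:
                run["error"] = repr(exc)
                raise
            finally:
                _current_run.reset(token)  # restore the previous parent

        return wrapper

    return decorator


@toy_traceable(name="llm_augment_object", tags=["openai"])
async def augment_object(shape: str) -> str:
    return f"augmented {shape}"


@toy_traceable(name="llm_augment_scene", tags=["openai"])
async def augment_scene(shapes: List[str]) -> List[str]:
    return [await augment_object(s) for s in shapes]


result = asyncio.run(augment_scene(["box", "sphere"]))
```

After the run, `ROOT_RUNS` holds a single `llm_augment_scene` run with two `llm_augment_object` children, which is the same parent/child shape as the hierarchy diagram in the guide.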

diff --git a/backend/main.py b/backend/main.py
@@ -26,6 +26,8 @@
 import asyncio
 from dotenv import load_dotenv
 from pathlib import Path
+from langsmith import traceable
+from starlette.middleware.base import BaseHTTPMiddleware
 
 # Import Asset Pydantic models
 from .schemas.assets import (
@@ -335,6 +337,38 @@ def validate_file_type_with_magic_bytes(
 )
 
 
+# LangSmith tracing middleware
+class LangSmithMiddleware(BaseHTTPMiddleware):
+    """Middleware to trace HTTP requests with LangSmith"""
+
+    async def dispatch(self, request: Request, call_next):
+        # Get settings to check if tracing is enabled
+        settings = get_settings()
+
+        # Skip tracing if disabled or for static files/health checks
+        if not settings.LANGCHAIN_TRACING_V2 or request.url.path in ["/health", "/docs", "/openapi.json", "/redoc"] or request.url.path.startswith("/assets"):
+            return await call_next(request)
+
+        # Create traced function for the request
+        @traceable(
+            name=f"{request.method} {request.url.path}",
+            tags=["http_request", "backend", request.method.lower()],
+            metadata={
+                "method": request.method,
+                "path": request.url.path,
+                "query_params": dict(request.query_params),
+            }
+        )
+        async def process_request():
+            response = await call_next(request)
+            return response
+
+        return await process_request()
+
+
+app.add_middleware(LangSmithMiddleware)
+
+
 # Add rate limiting
 @app.exception_handler(RateLimitExceeded)
 async def rate_limit_handler(request, exc):
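Note on `LangSmithMiddleware`: the setup guide suggests sampling for high-throughput deployments, but the middleware currently traces every request that is not skipped. A minimal sketch of a combined skip-and-sample decision follows, assuming a hypothetical `sample_rate` setting that does not exist in this PR. The skip paths mirror the condition in `dispatch`; `should_trace` and its parameters are illustrative names.

```python
import random
from typing import Callable

# Paths and prefix mirror the skip condition in LangSmithMiddleware.dispatch.
SKIP_PATHS = frozenset({"/health", "/docs", "/openapi.json", "/redoc"})
SKIP_PREFIXES = ("/assets",)


def should_trace(
    path: str,
    tracing_enabled: bool,
    sample_rate: float = 1.0,
    rand: Callable[[], float] = random.random,
) -> bool:
    """Decide whether one request should be traced.

    sample_rate is a hypothetical knob in [0.0, 1.0]; 1.0 traces everything.
    rand is injectable so the decision can be tested deterministically.
    """
    if not tracing_enabled:
        return False
    if path in SKIP_PATHS or path.startswith(SKIP_PREFIXES):
        return False
    # random.random() returns a float in [0.0, 1.0), so a rate of 0.0 never
    # traces and a rate of 1.0 always does.
    return rand() < sample_rate
```

In `dispatch`, the existing early-return condition could then become a single `if not should_trace(...)` check, with the rate read from settings. That wiring is left out here because it depends on how the project wants to expose the setting.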

diff --git a/backend/services/replicate_client.py b/backend/services/replicate_client.py
@@ -10,6 +10,7 @@
 from os import environ
 from typing import Dict, List, Optional, Any
 import requests
+from langsmith import traceable
 
 # Configure logging
 logger = logging.getLogger(__name__)
@@ -84,6 +85,7 @@ def __init__(self, api_key: Optional[str] = None):
 
         logger.info("ReplicateClient initialized successfully")
 
+    @traceable(name="replicate_generate_image", tags=["replicate", "image_generation", "flux"])
     def generate_image(
         self,
         prompt: str,
@@ -187,6 +189,7 @@ def generate_image(
                 "prediction_id": None
             }
 
+    @traceable(name="replicate_generate_video", tags=["replicate", "video_generation", "skyreels"])
     def generate_video(
         self,
         image_urls: List[str],
@@ -313,6 +316,7 @@ def generate_video(
                 "duration_seconds": 0
             }
 
+    @traceable(name="replicate_poll_prediction", tags=["replicate", "polling", "status"])
     def poll_prediction(
         self,
         prediction_id: str,
@@ -483,6 +487,7 @@ def estimate_cost(self, num_images: int, video_duration: int) -> float:
 
         return total_cost
 
+    @traceable(name="replicate_generate_video_from_pair", tags=["replicate", "video_generation", "image_to_video"])
     def generate_video_from_pair(
         self,
         image1_url: str,