CLAUDE.md — PodGraph

What is this project?

PodGraph is a web app that ingests podcast interviews, transcribes them, and builds AI-generated profiles of people entirely from their own words. It connects people through shared ideas, mutual references, and overlapping interests — surfaced as inline contextual cards on person pages, not as a standalone graph visualization.

The person page is the product. It is a comprehensive research tool where aggregated opinions, detailed anecdotes, cross-episode synthesis, and conviction-ranked positions live. There are no episode pages — episodes are internal data units that feed into person profiles.

A working example of a single-episode guest extraction output is in kevin-rose.html (prototype-era static HTML, now retired in favor of the Next.js person page).

Tech stack

  • Frontend: Next.js 14+ (App Router), TypeScript, Tailwind CSS, shadcn/ui
  • Backend API: Next.js API Routes + tRPC (type-safe, co-located with frontend)
  • Database: PostgreSQL (Supabase or Neon) with Prisma ORM
  • Job queue: BullMQ + Redis
  • Transcription: Deepgram API (nova-3 with speaker diarization)
  • AI extraction & aggregation: Anthropic Claude API (Sonnet 4.5)
  • Auth: NextAuth.js or Clerk
  • Deployment: Vercel (frontend) + Railway/Fly.io (workers)

What is NOT in scope

  • No pgvector / embeddings — RAG chat is deferred to a future phase
  • No TranscriptChunk table — no vector search infrastructure
  • No D3.js / vis.js graph viz — connections are inline contextual cards

Project structure

Current: Pipeline Scripts (Working)

podgraph/
├── scripts/
│   ├── pipeline.ts                    # Full pipeline: transcribe → correct → identify → extract → update-registry
│   ├── transcribe.ts                  # Deepgram nova-3 with diarization
│   ├── correct-transcript.ts          # Claude-powered proper noun correction
│   ├── identify-speakers.ts           # Claude maps speaker labels to names
│   ├── extract.ts                     # Structured theme-based extraction (Zod validated)
│   ├── update-registry.ts             # Merge entities into global registry
│   ├── anthropic-cost.ts              # Cost estimation utility
│   ├── youtube-meta.ts                # YouTube metadata & yt-dlp integration
│   ├── test-keys.ts                   # Validate API keys
│   ├── lex/
│   │   ├── pipeline.ts                # Fast path for Lex Fridman transcripts
│   │   └── fetch-transcript.ts        # Scrape lexfridman.com transcripts
│   └── lib/
│       ├── dirs.ts                    # Directory resolution & slug generation
│       ├── manifest.ts                # Episode manifest management
│       └── schemas.ts                 # Zod schemas for extraction output
├── prompts/
│   ├── extraction.txt                 # Theme extraction prompt
│   ├── speaker-id.txt                 # Speaker identification prompt
│   └── correct-names.txt              # Proper noun correction prompt
├── data/
│   ├── episodes/                      # Per-episode data directories
│   ├── entities.json                  # Global entity registry
│   ├── corrections-global.json        # Accumulated name corrections
│   └── manifest.json                  # Processed episodes index
└── package.json

Target: Full Application

podgraph/
├── prisma/schema.prisma
├── src/
│   ├── app/
│   │   ├── person/[slug]/page.tsx     # Person profile (THE primary surface)
│   │   ├── podcast/[slug]/page.tsx    # Podcast info + episode list
│   │   ├── explore/                   # Category browsing
│   │   ├── search/page.tsx            # Full-text search
│   │   └── admin/                     # Episode ingestion, pipeline monitoring
│   ├── components/
│   │   ├── ui/                        # shadcn/ui
│   │   ├── person/                    # Profile sections, connection cards
│   │   └── layout/                    # App shell, navigation
│   ├── lib/
│   │   ├── db.ts                      # Prisma client
│   │   ├── ai/
│   │   │   ├── extraction.ts          # Theme extraction
│   │   │   ├── correction.ts          # Proper noun correction
│   │   │   ├── summarization.ts       # Worldview synthesis
│   │   │   └── aggregation.ts         # Conviction ranking, taste clustering
│   │   └── pipeline/
│   │       ├── ingestion.ts           # Audio download, episode creation
│   │       ├── transcription.ts       # Deepgram integration
│   │       ├── speaker-id.ts          # Speaker identification
│   │       ├── aggregation.ts         # Person page data aggregation
│   │       ├── connections.ts         # Connection scoring engine
│   │       └── registry.ts            # Entity registry management
│   └── workers/
│       ├── transcription.worker.ts
│       ├── extraction.worker.ts
│       └── aggregation.worker.ts
├── scripts/                           # Standalone CLI scripts (kept working)
├── prompts/                           # AI prompt templates
└── data/                              # Local pipeline data

Retired from Prototype

  • scripts/build-page.ts — Static HTML episode page generator. Replaced by Next.js person page.
  • output/ — Generated HTML directory. No longer needed.

The cardinal rule: first-party data only

Every piece of information on a Person profile comes exclusively from that person's own words in transcribed podcast interviews. No Wikipedia, no LinkedIn, no external bios. The only exception is the person's name and basic deduplication metadata. This constraint is non-negotiable.

Core data models

Extraction Schema (Validated in Prototype)

Primary unit is the Theme, with quotes/opinions/anecdotes nested within:

  • Theme: name, depth (mentioned/discussed/deep_dive), description, best_quote, opinion, anecdote, related_people, related_companies
  • Speaker Data: role, self_description, themes, books, movies_tv, music, games, tools_products, companies_orgs (with slug), people_mentioned (with slug)
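
A minimal sketch of how the Theme shape above might be typed. The project validates with Zod in scripts/lib/schemas.ts; the plain interface and the `isThemeDepth` guard below are dependency-free stand-ins, and every field type not stated above (optional strings, string arrays) is an assumption.

```typescript
// Hypothetical plain-TypeScript mirror of the Theme shape.
// The real schemas live in scripts/lib/schemas.ts and use Zod.

const THEME_DEPTHS = ["mentioned", "discussed", "deep_dive"] as const;
type ThemeDepth = (typeof THEME_DEPTHS)[number];

interface Theme {
  name: string;
  depth: ThemeDepth;
  description: string;
  // Assumed optional: not every theme carries a quote, opinion, or anecdote.
  best_quote?: string;
  opinion?: string;
  anecdote?: string;
  // Assumed to be lists of names or slugs.
  related_people: string[];
  related_companies: string[];
}

// Runtime guard for the depth enum (the check a Zod enum would perform).
function isThemeDepth(value: unknown): value is ThemeDepth {
  return typeof value === "string" && THEME_DEPTHS.some((d) => d === value);
}

// Placeholder data for illustration only.
const exampleTheme: Theme = {
  name: "Example theme",
  depth: "discussed",
  description: "Placeholder description.",
  related_people: [],
  related_companies: [],
};
```

Keeping the depth values in one `as const` array means the static type and the runtime check cannot drift apart, which is the same property the shared Zod schemas are meant to provide.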

Database Entities (Phase 1)

Five main entities: Podcast, Episode (internal, no public route), Person, PersonConnection, EntityRegistry

Key Person fields:

  • worldview_summary — AI-generated narrative of beliefs and recurring themes (not a biography)
  • convictions — Ranked positions with conviction strength, evolution timestamps
  • deep_on_topics — 2-4 most specific deep-dive topics across multiple appearances
  • taste_clusters — Recommendations clustered by pattern
  • themes_aggregated — Merged themes across all appearances

Key design decisions:

  • Episodes have NO public route. All content surfaces through person pages.
  • PersonConnection powers inline contextual cards, not a graph visualization.
  • EntityRegistry tracks canonical names, aliases, and cross-episode references.
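
A rough sketch of the lookup an entity registry with canonical names and aliases implies. The entry fields, the normalization rule, and the function names are all hypothetical; the actual logic in update-registry.ts may differ.

```typescript
// Hypothetical registry entry; field names are assumptions.
interface RegistryEntry {
  slug: string;
  canonicalName: string;
  aliases: string[];
  episodeIds: string[]; // cross-episode references
}

// Normalize a name for comparison: trim, lowercase, collapse whitespace.
function normalizeName(name: string): string {
  return name.trim().toLowerCase().replace(/\s+/g, " ");
}

// Find the entry whose canonical name or any alias matches the mention.
function resolveEntity(registry: RegistryEntry[], mention: string): RegistryEntry | undefined {
  const key = normalizeName(mention);
  return registry.find(
    (entry) =>
      normalizeName(entry.canonicalName) === key ||
      entry.aliases.some((alias) => normalizeName(alias) === key)
  );
}

// Attach the episode to an existing entry, or create a new entry.
function recordMention(registry: RegistryEntry[], mention: string, episodeId: string): RegistryEntry {
  const existing = resolveEntity(registry, mention);
  if (existing) {
    if (existing.episodeIds.indexOf(episodeId) === -1) existing.episodeIds.push(episodeId);
    return existing;
  }
  const entry: RegistryEntry = {
    slug: normalizeName(mention).replace(/[^a-z0-9]+/g, "-").replace(/^-+|-+$/g, ""),
    canonicalName: mention.trim(),
    aliases: [],
    episodeIds: [episodeId],
  };
  registry.push(entry);
  return entry;
}
```

Exact-match lookup after normalization is the simplest possible alias check; fuzzy matching or model-assisted alias detection would slot into `resolveEntity` without changing the callers.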

AI processing pipeline

Episode Pipeline (5 stages)

Each stage is independent and retriable:

  1. Transcription (transcribe.ts): Deepgram nova-3, speaker diarization, utterance grouping
  2. Transcript correction (correct-transcript.ts): Global corrections file + Claude for remaining proper noun errors
  3. Speaker identification (identify-speakers.ts): Claude maps speaker labels to real names with confidence scores
  4. AI extraction (extract.ts): Structured theme-based extraction, Zod validated, chunked for long transcripts
  5. Entity registry update (update-registry.ts): Merge people/companies into global registry with alias detection
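
One way to picture "independent and retriable" is a resume helper plus a retry wrapper. This is a sketch, not the logic in pipeline.ts: the stage names come from the pipeline comment in the project structure, while `remainingStages`, `withRetry`, and the rule that a missing stage forces every later stage to rerun are assumptions. Real stages would be async; the wrapper is synchronous here to keep the example short.

```typescript
// Stage order as listed for scripts/pipeline.ts.
const STAGES = ["transcribe", "correct", "identify", "extract", "update-registry"] as const;
type Stage = (typeof STAGES)[number];

// Stages still to run, in order. Assumption: once one stage is missing,
// all later stages rerun because they consume its output.
function remainingStages(completed: readonly Stage[]): Stage[] {
  const firstMissing = STAGES.findIndex((stage) => completed.indexOf(stage) === -1);
  return firstMissing === -1 ? [] : STAGES.slice(firstMissing);
}

// Run a stage up to maxAttempts times, rethrowing the last error.
function withRetry<T>(run: () => T, maxAttempts: number): T {
  let lastError: unknown = new Error("maxAttempts must be at least 1");
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return run();
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}
```

With this shape, a failed extraction can be retried by calling `remainingStages(["transcribe", "correct", "identify"])` and running only what it returns, without paying for transcription again.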

Person Page Aggregation Pipeline (Phase 2)

When a new episode is processed, this pipeline runs for each identified speaker:

  1. Conviction extraction — Extract positions, rank by strength, detect evolution over time
  2. Worldview synthesis — Generate narrative from convictions and deep-dive themes
  3. Deep-on identification — Find topics with deep-dive depth across 2+ appearances
  4. Taste clustering — Cluster recommendations by thematic pattern using Claude
  5. Connection card generation — Format high-scoring connections into inline card types
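
Step 3 is the most mechanical of these and can be sketched without a model call. The `AppearanceTheme` shape, the name-normalization rule, and ranking by appearance count are assumptions; how "most specific" is actually judged is not defined here.

```typescript
// Hypothetical flattened view of one theme from one episode appearance.
interface AppearanceTheme {
  episodeId: string;
  theme: string;
  depth: "mentioned" | "discussed" | "deep_dive";
}

// Topics with deep_dive depth in at least two distinct episodes,
// most frequent first, capped at maxTopics (the page shows 2-4).
function deepOnTopics(themes: AppearanceTheme[], maxTopics = 4): string[] {
  const episodesByTheme = new Map<string, Set<string>>();
  for (const t of themes) {
    if (t.depth !== "deep_dive") continue;
    const key = t.theme.trim().toLowerCase();
    const episodes = episodesByTheme.get(key) ?? new Set<string>();
    episodes.add(t.episodeId);
    episodesByTheme.set(key, episodes);
  }
  return Array.from(episodesByTheme.entries())
    .filter(([, episodes]) => episodes.size >= 2)
    // Assumption: rank by number of appearances, then alphabetically.
    .sort((a, b) => b[1].size - a[1].size || a[0].localeCompare(b[0]))
    .slice(0, maxTopics)
    .map(([theme]) => theme);
}
```

Counting distinct episode IDs rather than raw theme rows keeps a long transcript that was chunked during extraction from qualifying a topic on its own.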

Connection scoring

Weighted sum, normalized to 0–1. Prioritizes specificity over surface-level overlap:

| Signal | Score |
| --- | --- |
| Co-appearance on same episode | +0.5 |
| Mutual mention (both directions) | +0.4 |
| Shared specific book recommendation | +0.3 per book |
| Opposing positions on same specific topic | +0.3 per topic |
| Shared movie/music recommendation | +0.2 per item |
| One-way mention (especially praise of lesser-known) | +0.15 |
| Shared specific topic (not generic) | +0.1 per topic |

"Both mention coffee" is not a connection. "Both recommend the same specific book" is.

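The weighted sum can be sketched as follows. The weights are the ones listed in this section; the `ConnectionSignals` shape is hypothetical, treating a mutual mention as replacing (not stacking with) a one-way mention is an assumption, and so is normalizing by clamping at 1, since the normalization method is not specified here.

```typescript
// Per-signal weights from the connection scoring list.
const WEIGHTS = {
  coAppearance: 0.5,
  mutualMention: 0.4,
  sharedBook: 0.3,
  opposingTopic: 0.3,
  sharedMovieOrMusic: 0.2,
  oneWayMention: 0.15,
  sharedSpecificTopic: 0.1,
} as const;

// Hypothetical input: counts and flags for one pair of people.
interface ConnectionSignals {
  coAppearance: boolean;
  mention: "none" | "one-way" | "mutual";
  sharedBooks: number;
  opposingTopics: number;
  sharedMoviesOrMusic: number;
  sharedSpecificTopics: number; // generic topics must be filtered out upstream
}

function connectionScore(s: ConnectionSignals): number {
  // Assumption: a mutual mention supersedes a one-way mention.
  const mentionScore =
    s.mention === "mutual" ? WEIGHTS.mutualMention :
    s.mention === "one-way" ? WEIGHTS.oneWayMention : 0;

  const raw =
    (s.coAppearance ? WEIGHTS.coAppearance : 0) +
    mentionScore +
    s.sharedBooks * WEIGHTS.sharedBook +
    s.opposingTopics * WEIGHTS.opposingTopic +
    s.sharedMoviesOrMusic * WEIGHTS.sharedMovieOrMusic +
    s.sharedSpecificTopics * WEIGHTS.sharedSpecificTopic;

  // Assumption: normalize to 0-1 by clamping the raw sum.
  return Math.min(1, raw);
}
```

Clamping keeps strongly connected pairs from dominating ordering by sheer item count, at the cost of making every pair above a raw sum of 1 indistinguishable; dividing by a maximum observed score would be one alternative.
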
Person page design

The person page sections (see roadmap Section 3 for full detail):

  1. Header — Name, aggregated self-described roles, appearance count
  2. Worldview Summary — 2-3 paragraph narrative (NOT a biography, NOT "X is a...")
  3. Convictions & Positions — Ranked by conviction strength, with evolution tracking
  4. Contextual Connections (Inline) — "Also spoke about", "Disagrees on", "Recommended by"
  5. Taste Profile — Recommendations clustered by pattern, not flat lists
  6. "Deep On" Badges — 2-4 most specific deep-dive topics near header
  7. People They Mention — Aggregated with context and sentiment
  8. Podcast Appearances — Chronological list with 2-3 strongest themes per appearance

What is NOT on the person page

  • No network graph visualization
  • No external biographical data
  • No episode pages (episodes are data sources, not destinations)
  • No AI chat (deferred)

Frontend pages

| Route | Page |
| --- | --- |
| / | Home / Explore — Featured people, trending topics, search |
| /person/[slug] | Person Profile — Full aggregated profile with inline connections |
| /podcast/[slug] | Podcast Page — Info, processed episodes, host profile link |
| /explore/[category] | Category Browse — By occupation, interest, hobby |
| /search | Search — Full-text across people, topics, quotes |

Routes that do NOT exist: /episode/[id], /person/[slug]/chat, /connections

Implementation phases

Phase 0: Pipeline Scripts — COMPLETE

Core extraction pipeline validated. Five stages working. Static HTML generator (build-page.ts) retired.

Phase 1: Foundation

Next.js scaffolding, database schema (Prisma), migrate pipeline to app modules, manual episode ingestion admin form, basic person list page, basic search.

Phase 2: Person Page MVP

Person page aggregation pipeline, worldview summary, conviction extraction & ranking, full profile page with all sections, deep-on badges, taste clustering, contextual connection cards.

Phase 3: Discovery & Polish

Category browsing, home/explore page, podcast page, responsive design, search enhancements.

Phase 4: Scale & Automation

RSS auto-ingestion, admin dashboard, auth + user accounts, performance optimization.

Code conventions

  • TypeScript strictly throughout. No any types.
  • Prisma for all database operations.
  • BullMQ for all background jobs. Workers run in separate processes.
  • Server components by default in Next.js. Client components only when interactivity is needed.
  • All AI prompts stored as template strings in dedicated files under prompts/.
  • Shared Zod schemas in scripts/lib/schemas.ts (later src/lib/schemas.ts) — never duplicate schema definitions.
  • AI extraction prompts request structured JSON. Validate with Zod at runtime. Retry with schema feedback on failure.
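
The validate-then-retry convention in the last bullet might look like the sketch below. Validation is abstracted behind a hypothetical `Validator` type so the block runs without dependencies; in the project that role would be played by a Zod schema's `safeParse`. The model call is injected and synchronous here purely for brevity (a real call would be async), and the retry prompt wording is invented for illustration.

```typescript
// Hypothetical validation result; a Zod safeParse result would be adapted to this.
type ValidationResult<T> = { ok: true; value: T } | { ok: false; issues: string[] };
type Validator<T> = (data: unknown) => ValidationResult<T>;

// Append the validation issues to the original prompt as feedback.
function buildRetryPrompt(basePrompt: string, issues: string[]): string {
  const feedback = issues.map((issue) => `- ${issue}`).join("\n");
  return `${basePrompt}\n\nYour previous response failed validation:\n${feedback}\nReturn corrected JSON only.`;
}

// Call the model, parse and validate its JSON, and retry with feedback on failure.
function extractWithRetry<T>(
  callModel: (prompt: string) => string, // injected; stands in for the Claude API call
  validate: Validator<T>,
  basePrompt: string,
  maxAttempts = 3
): T {
  let prompt = basePrompt;
  let lastIssues: string[] = [];
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const raw = callModel(prompt);
    let parsed: unknown;
    try {
      parsed = JSON.parse(raw);
    } catch {
      lastIssues = ["Response was not valid JSON"];
      prompt = buildRetryPrompt(basePrompt, lastIssues);
      continue;
    }
    const result = validate(parsed);
    if (result.ok === false) {
      lastIssues = result.issues;
      prompt = buildRetryPrompt(basePrompt, lastIssues);
      continue;
    }
    return result.value;
  }
  throw new Error(`Extraction failed after ${maxAttempts} attempts: ${lastIssues.join("; ")}`);
}
```

Rebuilding the retry prompt from `basePrompt` each time, instead of appending to the previous retry prompt, keeps the prompt from growing with every failed attempt.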

Environment variables

See .env.example. Currently required: DEEPGRAM_API_KEY, ANTHROPIC_API_KEY. Phase 1 will add: DATABASE_URL, REDIS_URL, NEXTAUTH_SECRET/NEXTAUTH_URL.
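
A small startup check for these variables could look like this sketch. The variable names come from the paragraph above; `missingEnv` and the two constant names are hypothetical, and the Phase 1 list assumes NextAuth.js is the auth choice.

```typescript
// Variables required by the current pipeline scripts.
const REQUIRED_NOW = ["DEEPGRAM_API_KEY", "ANTHROPIC_API_KEY"] as const;

// Variables Phase 1 is expected to add (assumes NextAuth.js is chosen).
const REQUIRED_PHASE_1 = ["DATABASE_URL", "REDIS_URL", "NEXTAUTH_SECRET", "NEXTAUTH_URL"] as const;

// Names of required variables that are unset or blank in the given environment.
function missingEnv(
  env: Record<string, string | undefined>,
  required: readonly string[]
): string[] {
  return required.filter((name) => (env[name] ?? "").trim() === "");
}
```

In a script this would typically be called as `missingEnv(process.env, REQUIRED_NOW)` before any API request, failing fast with the list of missing names.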

Reference files

  • podgraph-roadmap-revised.md — Full architecture and implementation guide (the source of truth)
  • kevin-rose.html — Example of prototype-era static guest extraction output
  • TODO.md — Current task priorities

TODO list

Check off items in TODO.md as they are completed.

Case study (CASE_STUDY.md)

Maintain a portfolio case study about PodGraph at CASE_STUDY.md. Update it as the project evolves.

Structure:

  1. Problem — Why podcast content is hard to search/reference, and what I wanted to build
  2. Architecture decisions — Deepgram Nova-3 for transcription with speaker diarization, Claude API for structured extraction, the data pipeline design
  3. Tradeoffs — What I considered and rejected, why I chose these tools over alternatives
  4. Challenges — What was harder than expected and how I worked through it
  5. Current state / what's next

Tone: Write like explaining this to a senior engineer over coffee. No tutorial voice, no fluff. Be specific about technical choices and reasoning. Keep it to ~800 words.