
feat: Add contextual documentation retrieval tool #78

Open
mattpodwysocki wants to merge 3 commits into main from feature/contextual-docs-tool

mattpodwysocki commented Feb 12, 2026

Summary

Implements get_contextual_docs_tool for intelligent documentation retrieval based on user context, code snippets, and error messages.

New in this iteration: Multi-stage crawling and curated high-value pages ensure specific, actionable results without requiring web search fallback.

Key Features

🎯 Curated Documentation Pages (NEW)

  • 50+ high-value pages organized by topic (markers, popups, layers, styling, geocoding, navigation, etc.)
  • Intelligent keyword matching handles singular/plural variations
  • Direct fetching of the most relevant content for common queries
  • Easily expandable - add new pages as patterns emerge
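
A minimal sketch of how a topic-keyed lookup with singular/plural matching might work. The `CURATED_PAGES` table, its placeholder URLs, and the `singularize` and `findCuratedPages` helpers are hypothetical names for illustration, not the PR's actual data or code.

```typescript
// Hypothetical curated-page table; topics and URLs are placeholders.
interface CuratedPage {
  title: string;
  url: string;
}

const CURATED_PAGES = new Map<string, CuratedPage[]>([
  ["marker", [{ title: "Add a Marker Example", url: "https://example.com/docs/add-a-marker" }]],
  ["popup", [{ title: "Popup Examples", url: "https://example.com/docs/popups" }]],
  ["layer", [{ title: "Layers Guide", url: "https://example.com/docs/layers" }]],
]);

// Naive singularization: "markers" -> "marker", "categories" -> "category".
function singularize(word: string): string {
  if (word.length > 4 && word.endsWith("ies")) return word.slice(0, -3) + "y";
  if (word.length > 3 && word.endsWith("s") && !word.endsWith("ss")) return word.slice(0, -1);
  return word;
}

// Split the query into words, normalize each to singular form,
// and collect the curated pages for every topic that matches.
function findCuratedPages(query: string): CuratedPage[] {
  const seen = new Set<string>();
  const results: CuratedPage[] = [];
  for (const raw of query.toLowerCase().split(/[^a-z0-9]+/)) {
    if (!raw) continue;
    const pages = CURATED_PAGES.get(singularize(raw)) ?? [];
    for (const page of pages) {
      if (!seen.has(page.url)) {
        seen.add(page.url);
        results.push(page);
      }
    }
  }
  return results;
}
```

Keying the table by singular topic means a new page is one added entry, which fits the "easily expandable" goal.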

🔍 Multi-Stage Crawling (NEW)

  1. Stage 1: Fetch documentation index (llms.txt)
  2. Stage 2: Fetch top pages from index
  3. Stage 3: Extract and follow links from those pages
  4. Stage 4: Parse HTML content and extract meaningful sections
  5. Stage 5: Score and rank all content by relevance
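
The first three stages can be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: it assumes the index lists pages as Markdown links, uses regular expressions instead of linkedom so it stays dependency-free, and takes an injected `fetchText` function. The names `parseIndexLinks`, `extractHrefs`, `crawl`, `topN`, and `maxLinked` are hypothetical.

```typescript
type FetchText = (url: string) => Promise<string>;

// Stage 1 helper: pull [title](url) links out of a Markdown-style index.
function parseIndexLinks(indexText: string): string[] {
  const links: string[] = [];
  const re = /\[[^\]]*\]\((https?:\/\/[^)\s]+)\)/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(indexText)) !== null) links.push(m[1]);
  return links;
}

// Stage 3 helper: pull absolute href targets out of raw HTML.
function extractHrefs(html: string): string[] {
  const hrefs: string[] = [];
  const re = /href="(https?:\/\/[^"]+)"/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(html)) !== null) hrefs.push(m[1]);
  return hrefs;
}

// Returns a map of URL -> raw HTML for later parsing and scoring (stages 4 and 5).
async function crawl(
  indexUrl: string,
  fetchText: FetchText,
  topN: number = 3,
  maxLinked: number = 5
): Promise<Map<string, string>> {
  const pages = new Map<string, string>();

  // Stage 1: fetch the documentation index.
  const indexText = await fetchText(indexUrl);

  // Stage 2: fetch the top pages listed in the index.
  for (const url of parseIndexLinks(indexText).slice(0, topN)) {
    pages.set(url, await fetchText(url));
  }

  // Stage 3: follow links found on those pages, up to a fixed budget.
  const linked: string[] = [];
  pages.forEach((html) => linked.push(...extractHrefs(html)));
  let followed = 0;
  for (const url of linked) {
    if (followed >= maxLinked) break;
    if (!pages.has(url)) {
      pages.set(url, await fetchText(url));
      followed++;
    }
  }
  return pages;
}
```

Capping the number of followed links keeps the crawl bounded, and passing the fetcher in makes the pipeline testable without network access.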

🧠 Context-Aware Analysis

  • Keyword extraction from user descriptions, code snippets, error messages
  • Code pattern recognition (API calls, methods, Mapbox-specific patterns)
  • Error analysis (identifies common error types and patterns)
  • Technology filtering (mapbox-gl-js, iOS SDK, Android SDK, etc.)
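
As a rough illustration of the analysis step, the sketch below pulls keywords from free text and from simple code patterns. The field names `codeSnippet` and `errorMessage`, the stop-word list, and the regular expression are assumptions for the example; the PR's actual extraction rules may differ.

```typescript
interface DocsQuery {
  context?: string;
  codeSnippet?: string;   // hypothetical field name
  errorMessage?: string;  // hypothetical field name
}

// Small illustrative stop-word list; a real one would be larger.
const STOP_WORDS = new Set([
  "the", "with", "and", "for", "how", "trying", "that", "this", "not",
]);

function extractKeywords(query: DocsQuery): string[] {
  const keywords = new Set<string>();

  // Free text (description and error message): lowercase words minus stop words.
  const addWords = (text: string): void => {
    for (const word of text.toLowerCase().split(/[^a-z0-9]+/)) {
      if (word.length > 2 && !STOP_WORDS.has(word)) keywords.add(word);
    }
  };
  addWords(query.context ?? "");
  addWords(query.errorMessage ?? "");

  // Code: keep identifiers that follow `mapboxgl.` or appear as `.method(` calls.
  const re = /mapboxgl\.(\w+)|\.(\w+)\s*\(/g;
  const code = query.codeSnippet ?? "";
  let m: RegExpExecArray | null;
  while ((m = re.exec(code)) !== null) {
    keywords.add((m[1] || m[2]).toLowerCase());
  }
  return Array.from(keywords);
}
```

Treating code separately from prose keeps punctuation-heavy snippets from flooding the keyword set with noise such as `new` or `map`.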

📊 Intelligent Relevance Scoring

  • Keyword matches in titles (highest weight)
  • Keyword matches in content
  • Technology-specific relevance
  • Error-related content
  • Curated pages get priority scoring
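
One way to combine these signals is a weighted sum that also records why each point was awarded. The weights, the `DocSection` shape, and the function names below are hypothetical; the PR only states that title matches carry the highest weight and that curated pages get priority.

```typescript
interface DocSection {
  title: string;
  content: string;
  url: string;
  curated?: boolean;
}

interface ScoredSection {
  section: DocSection;
  score: number;
  reasons: string[]; // match explanations shown to the user
}

// Illustrative weights only; title matches rank highest, as described above.
const WEIGHTS = { title: 10, content: 2, technology: 5, error: 3, curated: 8 };

function scoreSection(
  section: DocSection,
  keywords: string[],
  technology?: string,
  hasError: boolean = false
): ScoredSection {
  const title = section.title.toLowerCase();
  const content = section.content.toLowerCase();
  const reasons: string[] = [];
  let score = 0;

  for (const kw of keywords) {
    if (title.includes(kw)) {
      score += WEIGHTS.title;
      reasons.push(`title matches "${kw}"`);
    }
    if (content.includes(kw)) {
      score += WEIGHTS.content;
      reasons.push(`content mentions "${kw}"`);
    }
  }
  if (technology && (section.url.includes(technology) || content.includes(technology))) {
    score += WEIGHTS.technology;
    reasons.push(`specific to ${technology}`);
  }
  if (hasError && /error|troubleshoot/.test(content)) {
    score += WEIGHTS.error;
    reasons.push("covers errors or troubleshooting");
  }
  if (section.curated) {
    score += WEIGHTS.curated;
    reasons.push("curated high-value page");
  }
  return { section, score, reasons };
}

// Score every section, sort descending, and keep the top `limit` results.
function rankSections(
  sections: DocSection[],
  keywords: string[],
  technology?: string,
  limit: number = 5
): ScoredSection[] {
  return sections
    .map((s) => scoreSection(s, keywords, technology))
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}
```

Collecting `reasons` alongside the score is what makes the match explanations essentially free to produce.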

💡 Helpful Extras

  • Match explanations - Why each result is relevant
  • Troubleshooting tips - Actionable advice for errors
  • Related topics - Suggested concepts to explore
  • Performance - 1-hour caching for both index and HTML pages
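
The 1-hour caching could be implemented with a small time-to-live cache along these lines. This is a sketch under assumptions: the `TtlCache` class and its injectable clock are hypothetical and are not taken from the PR.

```typescript
// Minimal TTL cache: entries expire a fixed time after they are stored.
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  // Default TTL is one hour; the clock is injectable so expiry can be tested.
  constructor(ttlMs: number = 60 * 60 * 1000, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // drop stale entries lazily on read
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

Keying one instance by URL would cover both the index and the individual HTML pages.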

Before & After

Before (Original Implementation)

Query: "adding markers with popups"
Result: 1 generic "Mapbox Documentation" result
Claude: "Let me search the web for more specific information..."

After (With Curated Pages)

Query: "adding markers with popups"
Results:
  1. Add a Marker Example (90% relevance)
  2. Marker API Reference (85% relevance)
  3. Popup Examples (80% relevance)
  4. Custom Marker Icons (75% relevance)
Claude: "Here's the relevant documentation..." ✅ No web search needed!

Use Cases

  1. Building features - "I'm trying to add custom markers with popups"
  2. Debugging errors - "Getting error: 'Style is not done loading'"
  3. Learning APIs - "Working with mapbox-gl-js to show user location"
  4. Troubleshooting - "How do I handle rate limiting errors?"
  5. Finding examples - "Show me how to add custom styled layers"

Implementation Details

Architecture

  • HTML Parsing: Uses linkedom for fast, lightweight DOM parsing
  • Curated Pages: Topic-based dictionary with singular/plural matching
  • Crawling Strategy: Fetch curated pages first, supplement with discovered content
  • Scoring Algorithm: Weighted scoring with match explanations

Curated Topics (Expandable)

  • Markers & Popups
  • Layers & Styling
  • Data Sources
  • Events & Interaction
  • Geocoding & Search
  • Navigation & Directions
  • Controls
  • Camera & Animation
  • 3D & Terrain
  • Expressions

Dependencies

  • linkedom (new) - HTML parsing and DOM manipulation
  • zod - Schema validation
  • All existing dependencies

Testing

13 comprehensive tests covering:

  • Basic functionality (context, code, errors, technology)
  • Relevance scoring and ranking
  • Suggestions generation
  • Error handling (HTTP errors, network errors)
  • Caching behavior (both index and HTML pages)
  • Output formatting

All 549 project tests passing

Example Usage

In MCP Inspector

{
  "context": "adding custom markers with popups to Mapbox GL JS map",
  "technology": "mapbox-gl-js",
  "limit": 5
}

In Claude Desktop (Natural Language)

"I'm trying to add custom markers with popups to my Mapbox GL JS map. Can you find relevant documentation?"

Result: Claude receives specific marker/popup documentation and provides direct guidance without web search.

Documentation

  • ✅ Updated CHANGELOG.md with feature details
  • ✅ Updated README.md with tool description and examples
  • ✅ Comprehensive inline code documentation
  • 📄 Created proposed improved llms.txt structure (in ~/Downloads)

Future Enhancements

  1. Improved llms.txt - Propose enhanced structure to docs team with more granular links
  2. Expand curated pages - Add more pages as usage patterns emerge
  3. Analytics - Track which queries need better coverage
  4. Multi-language support - Extend to iOS/Android-specific queries

Addresses

Closes #70

🤖 Generated with Claude Code

Implements get_contextual_docs_tool for intelligent documentation
retrieval based on user context, code snippets, and error messages.

Features:
- Context-aware keyword extraction from text, code, and errors
- Intelligent relevance scoring with match explanations
- Troubleshooting tips for error messages
- Technology-specific filtering (mapbox-gl-js, iOS SDK, Android SDK)
- Suggested related topics
- Ranked results with excerpts and direct documentation links
- 1-hour caching for performance

Smarter than simple search - understands full context and provides
actionable, targeted documentation guidance.

Addresses: #70

Changes:
- Add GetContextualDocsTool implementation
- Register in toolRegistry.ts
- Add comprehensive test coverage (13 tests)
- Update CHANGELOG.md with feature details
- Update README.md with tool documentation and examples
- All 549 tests passing

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

mattpodwysocki requested a review from a team as a code owner February 12, 2026 22:48

mattpodwysocki and others added 2 commits February 13, 2026 00:31

Improvements:
- Add curated high-value documentation pages for common topics (markers, popups, layers, etc.)
- Implement three-stage crawling: index → main pages → linked pages
- Add HTML parsing and content extraction from docs pages
- Implement singular/plural keyword matching for better curated page discovery
- Add linkedom dependency for HTML parsing

This allows the tool to return specific, actionable documentation for queries
like "adding markers with popups" instead of generic plugin pages. Claude Desktop
no longer needs to supplement with web search for common Mapbox questions.

All 549 tests passing.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Add Docusaurus-specific selectors (article, [id*="docs-content"], .markdown)
- Extract meta descriptions as fallback
- Add paragraph/list/code extraction when headings are sparse
- Support H4 headings
- Add content length limits (2000 chars for paragraph extraction)
- Better fallback chain: headings → paragraphs → meta description

This fixes content extraction from Mapbox example pages which use Docusaurus
and don't have many section headings. Now queries like "popup on hover" work
without needing web search fallback.
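
The fallback chain described in this commit (headings → paragraphs → meta description) could look roughly like the sketch below. It works on an already-parsed page so it needs no DOM library. The `ParsedPage` shape, the `minHeadings` threshold, and the function name are assumptions; only the 2000-character limit comes from the commit message.

```typescript
interface ParsedPage {
  headings: { text: string; body: string }[]; // sections found under H1-H4
  paragraphs: string[];                       // paragraph, list, and code text
  metaDescription?: string;
}

const MAX_PARAGRAPH_CHARS = 2000; // limit stated in the commit message

interface Extracted {
  source: "headings" | "paragraphs" | "meta" | "none";
  text: string;
}

// Try headings first, then raw paragraphs, then the meta description.
function extractContent(page: ParsedPage, minHeadings: number = 2): Extracted {
  if (page.headings.length >= minHeadings) {
    const text = page.headings.map((h) => `${h.text}\n${h.body}`).join("\n\n");
    return { source: "headings", text };
  }
  const joined = page.paragraphs.join("\n").trim();
  if (joined.length > 0) {
    return { source: "paragraphs", text: joined.slice(0, MAX_PARAGRAPH_CHARS) };
  }
  const meta = (page.metaDescription ?? "").trim();
  if (meta.length > 0) {
    return { source: "meta", text: meta };
  }
  return { source: "none", text: "" };
}
```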

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>