Unified web search with DDG, SearXNG, Wikipedia, and LLM research.
Dual-mode: Works as both an MCP Server (for AI Agents) and a CLI Tool (for humans).
| Tool | MCP Name | CLI Command | Description |
|---|---|---|---|
| DuckDuckGo Search | `search_web_ddg` | `searweb ddg` | Web search via DDG HTML interface |
| SearXNG Search | `search_web_searxng` | `searweb xng` | Metasearch via SearXNG (auto-managed Docker) |
| Web Fetch | `fetch_web_markdown` | `searweb fetch` | Fetch webpages as clean markdown with rule engine |
| Wikipedia | `search_wikipedia` | `searweb wiki` | Search Wikipedia articles |
| AI Research | `llm_research` | `searweb research` | LLM-powered autonomous research with synthesis & citations |
Requires Node.js >= 20 and npm.
npm install -g searweb
# or use without installing
npx searweb
To verify installation:
searweb --version # Should print the version
searweb --help # List available commands
# Search DuckDuckGo
searweb ddg "TypeScript best practices" --limit 5
# Search SearXNG (auto-starts container if needed)
searweb xng "Rust programming" --limit 10
searweb xng --status # Check engine health
# Fetch a webpage
searweb fetch https://example.com
# Search Wikipedia
searweb wiki "Model Context Protocol" --lang en
# AI-powered research with synthesis & citations
searweb research "What is MCP and how does it work?" --level standard
# Continue previous research session
searweb research "Tell me more about MCP tools" --session <id>
searweb research --list # List saved sessions
searweb research --rm <id> -y # Delete a session
# Interactive configuration
searweb config
When called without arguments, searweb runs as an MCP server via stdio:
searweb # stdio mode (default)
searweb server # explicit stdio mode
searweb server config.json # with config file path
SSE Mode (for remote access or multiple clients):
SEARWEB_TRANSPORT=sse SEARWEB_SSE_PORT=3000 searweb
# or in config.json: { "transport": "sse", "ssePort": 3000 }
If you are an AI assistant reading this documentation and your user wants you to use searweb as an MCP server, follow these steps:
Ensure the user's system has:
- Node.js >= 20 (`node --version`)
- npm (`npm --version`)
- Docker (optional, only if using SearXNG auto-start)
If Node.js is missing, direct the user to install it from nodejs.org.
npm install -g searweb
Verify:
searweb --version
Ask the user for:
- `OPENAI_API_KEY`: Required for the `llm_research` tool. Without this, only basic search/fetch tools are available.
- `JINA_API_KEYS` (optional): Improves `fetch_web_markdown` reliability. Can be omitted.
If the user does not have an OpenAI API key, research functionality will not be available.
Choose the configuration format for your MCP client:
Edit claude_desktop_config.json (location varies by OS):
{
"mcpServers": {
"searweb": {
"command": "npx",
"args": ["-y", "searweb"],
"env": {
"OPENAI_API_KEY": "<ask-user>",
"OPENAI_MODEL": "gpt-4o-mini",
"SEARXNG_AUTO_START": "true",
"JINA_API_KEYS": "<optional>",
"SEARWEB_EXPOSE_UNAVAILABLE_TOOLS": "true"
}
}
}
}
Tip: Set `SEARWEB_EXPOSE_UNAVAILABLE_TOOLS=true` so that `llm_research` and `search_web_searxng` always appear in the tool list, even when not configured. This lets the AI see the tools and guide the user to configure them, instead of silently hiding them. Calls to unavailable tools return clear setup instructions.
Edit ~/.config/opencode/opencode.json or opencode.jsonc.
If your OpenCode version supports the environment field:
{
"mcp": {
"searweb": {
"type": "local",
"command": ["npx", "-y", "searweb"],
"enabled": true,
"environment": {
"OPENAI_API_KEY": "<ask-user>",
"OPENAI_MODEL": "gpt-4o-mini",
"SEARXNG_AUTO_START": "true",
"SEARWEB_EXPOSE_UNAVAILABLE_TOOLS": "true"
},
"timeout": 30000
}
}
}
If your OpenCode version does not support `environment` / `env`, or if tools fail to appear, use the bundled wrapper script (recommended for OpenCode on Windows):
{
"mcp": {
"searweb": {
"type": "local",
"command": [
"node",
"E:\\Epheia\\dev\\apps\\tool-apps\\searweb\\scripts\\start-with-env.js"
],
"enabled": true,
"timeout": 30000
}
}
}
The wrapper script:
- Reads `.env` from the project root automatically
- Defaults `SEARWEB_EXPOSE_UNAVAILABLE_TOOLS=true` so AI agents can discover optional tools
- Uses a `.js` extension and no extra arguments for maximum MCP client compatibility
After editing, verify with:
opencode mcp list
opencode mcp debug searweb
The general pattern is:
- Command: `npx -y searweb` (or `node /path/to/searweb/dist/index.js`)
- Transport: `stdio`
- Environment variables: Pass `OPENAI_API_KEY`, `OPENAI_MODEL`, etc.
- Timeout: Set to at least `30000` ms (30 seconds)
Once configured, test that the tools are available:
# This should list: search_web_ddg, fetch_web_markdown, search_wikipedia, llm_research
# (search_web_searxng only appears if SearXNG is healthy)
Try a simple query via your MCP client:
- Use `search_web_ddg` with query `"current date"`
- Use `llm_research` with query `"What is Model Context Protocol?"` and level `"quick"`
| Symptom | Cause | Fix |
|---|---|---|
| `llm_research` tool missing | `OPENAI_API_KEY` not set | Add API key to `env`/`environment`. Or set `SEARWEB_EXPOSE_UNAVAILABLE_TOOLS=true` to make the tool visible with setup instructions. |
| `search_web_searxng` missing | SearXNG not running | Set `SEARXNG_AUTO_START=true` and ensure Docker is running. Or set `SEARWEB_EXPOSE_UNAVAILABLE_TOOLS=true` to make the tool visible. |
| Tool calls time out | Default timeout too short | Set "timeout": 30000 or higher |
| Research answers have no citations | LLM ignored prompt | Normal — retry with --level deep or more specific query |
| Empty research result | LLM failed to synthesize | Check API key balance and model availability |
| Calling a tool returns "NOT CONFIGURED" | Tool was exposed but not actually configured | Follow the setup_instructions in the error response and restart the MCP server |
- Always cite sources: When `llm_research` returns an answer, the `sources` array contains the URLs. Reference them explicitly in your response to the user.
- Session IDs: If a research query is a follow-up, pass the previous `sessionId` to continue context.
- Budget controls: The `level` parameter controls depth:
  - `quick`: 1-3 loops, 2+ tools (fast, good for simple facts)
  - `standard`: 3-8 loops, 5+ tools (balanced)
  - `deep`: 6-15 loops, 8+ tools (thorough research)
- SearXNG dependency: The SearXNG tool may be unavailable. Always fall back to `search_web_ddg` if it is missing.
- Discover hidden tools: If `llm_research` or `search_web_searxng` are missing from the tool list, ask the user to set `SEARWEB_EXPOSE_UNAVAILABLE_TOOLS=true` in the MCP config. With this flag, the tools will appear with `[NOT CONFIGURED]` in their descriptions, and calling them returns setup instructions that you can show to the user.
- Privacy: `llm_research` sends queries and fetched page content to the configured LLM provider (default: OpenAI). Do not use it for sensitive personal data unless approved by the user.
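The budget levels listed above can be read as a small decision rule. The sketch below is one possible interpretation, not searweb's actual implementation: `Level`, `Budget`, `LEVEL_BUDGETS`, and `nextStep` are hypothetical names, and the rule "an answer is allowed once the lower loop bound and the tool minimum are both met" is an assumption.

```typescript
// Hypothetical sketch: these names are illustrative, not part of searweb's API.
type Level = "quick" | "standard" | "deep";

interface Budget {
  minLoops: number; // lower end of the documented loop range
  maxLoops: number; // hard upper limit on reasoning rounds
  minTools: number; // minimum tool calls before an answer is acceptable
}

// Numbers copied from the level list above.
const LEVEL_BUDGETS: Record<Level, Budget> = {
  quick: { minLoops: 1, maxLoops: 3, minTools: 2 },
  standard: { minLoops: 3, maxLoops: 8, minTools: 5 },
  deep: { minLoops: 6, maxLoops: 15, minTools: 8 },
};

type Step = "continue" | "may_finish" | "must_stop";

// Decide what the agent loop is allowed to do after a completed round.
function nextStep(budget: Budget, loopsDone: number, toolsUsed: number): Step {
  if (loopsDone >= budget.maxLoops) return "must_stop"; // upper limit reached
  if (loopsDone >= budget.minLoops && toolsUsed >= budget.minTools) {
    return "may_finish"; // both minimums met (assumed rule)
  }
  return "continue";
}

// After one quick-level round with two tool calls, finishing is allowed.
const afterFirstQuickLoop: Step = nextStep(LEVEL_BUDGETS.quick, 1, 2);
```

Under this reading, `quick` suits follow-up questions where latency matters, while `deep` forces the agent to gather more evidence before it may answer.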
Create a config.json or use environment variables:
{
"jinaApiKeys": ["your-jina-key"],
"searxngUrl": "http://localhost:8080",
"searxngAutoStart": true,
"llm": {
"provider": "openai",
"apiKey": "your-openai-key",
"model": "gpt-4o-mini"
}
}
Environment variables (all optional; they override config.json):
- `OPENAI_API_KEY`: OpenAI API key for LLM research
- `OPENAI_MODEL`: Model name (default: `gpt-4o-mini`)
- `JINA_API_KEYS`: Comma-separated Jina.ai API keys
- `JINA_DISABLE_REMOTE`: Disable remote Jina proxy (`true`/`false`)
- `SEARXNG_URL`: SearXNG instance URL
- `SEARXNG_AUTO_START`: Auto-start SearXNG container (`true`/`false`)
- `SEARWEB_TRANSPORT`: Transport mode: `stdio` (default) or `sse`
- `SEARWEB_SSE_PORT`: SSE server port (default: 3000)
- `SEARWEB_EXPOSE_UNAVAILABLE_TOOLS`: Expose SearXNG and `llm_research` in MCP even if not configured. Calls return setup instructions instead of silently hiding the tool. Useful when the MCP client caches tool lists and you want AI agents to discover optional tools.
Recommended for MCP: Use environment variables (via your MCP client's `env`/`environment` field) instead of config.json for API keys. This avoids storing secrets in files.
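To illustrate the precedence rule (environment variables override config.json), here is a minimal sketch of such a merge. It is not searweb's loader: `SearwebConfig`, `LlmConfig`, `Env`, and `applyEnvOverrides` are hypothetical names, and only a subset of the variables is handled. The field names mirror the config.json example, and the default model is the documented `gpt-4o-mini`.

```typescript
// Hypothetical sketch of "env overrides config.json"; not searweb's real loader.
interface LlmConfig {
  provider?: string;
  apiKey?: string;
  model?: string;
}

interface SearwebConfig {
  jinaApiKeys?: string[];
  searxngUrl?: string;
  searxngAutoStart?: boolean;
  transport?: "stdio" | "sse";
  ssePort?: number;
  llm?: LlmConfig;
}

type Env = Record<string, string | undefined>;

function applyEnvOverrides(fileConfig: SearwebConfig, env: Env): SearwebConfig {
  // Copy first so the file config object is never mutated.
  const llm: LlmConfig = { ...fileConfig.llm };
  const merged: SearwebConfig = { ...fileConfig, llm };

  const apiKey = env["OPENAI_API_KEY"];
  if (apiKey) llm.apiKey = apiKey;
  const model = env["OPENAI_MODEL"];
  if (model) llm.model = model;
  if (!llm.model) llm.model = "gpt-4o-mini"; // documented default

  const jina = env["JINA_API_KEYS"];
  if (jina) {
    // Comma-separated list; drop empty entries and surrounding whitespace.
    merged.jinaApiKeys = jina.split(",").map((k) => k.trim()).filter((k) => k.length > 0);
  }
  const url = env["SEARXNG_URL"];
  if (url) merged.searxngUrl = url;
  const autoStart = env["SEARXNG_AUTO_START"];
  if (autoStart !== undefined) merged.searxngAutoStart = autoStart === "true";
  const transport = env["SEARWEB_TRANSPORT"];
  if (transport === "stdio" || transport === "sse") merged.transport = transport;
  const port = env["SEARWEB_SSE_PORT"];
  if (port) {
    const parsed = parseInt(port, 10);
    if (!isNaN(parsed)) merged.ssePort = parsed;
  }
  return merged;
}

// The key from the environment wins over the one stored in the file.
const effective = applyEnvOverrides(
  { llm: { provider: "openai", apiKey: "key-from-file" } },
  { OPENAI_API_KEY: "key-from-env" },
);
```

Taking the environment as a plain object instead of reading a global keeps the function easy to test; a caller in Node.js would pass `process.env`.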
Search the web using DuckDuckGo.
searweb ddg "React hooks" --limit 10
searweb ddg "Python tutorial" --json # pipe-friendly JSON output
searweb ddg "AI news" --offset 30 # pagination
Fetch a webpage and convert to clean markdown.
searweb fetch https://github.com/modelcontextprotocol/specification
searweb fetch https://example.com --with-index # preserve navigation links
Search Wikipedia articles.
searweb wiki "Artificial Intelligence" --lang en --limit 5
searweb wiki "Kunstliche Intelligenz" --lang de
Search via SearXNG metasearch. Auto-starts local Docker container if configured.
searweb xng "Rust programming" --limit 10
searweb xng "OpenAI news" --page 2 # pagination
searweb xng --status # Check engine health (CAPTCHA, rate limits, timeouts)
AI-powered autonomous research with synthesis and inline citations.
Features:
- Agent Loop: LLM plans search strategy, fetches sources, and synthesizes answers
- Dual Budget: `maxLoops` (reasoning rounds, upper limit) + `minTools` (tool calls, lower limit)
- Tree Display: Real-time progress with loop budget indicators
- Session Persistence: Continue research later with `-s <id>`
- Citation Tracking: Every claim is cited with [^N^] referencing actual sources
- Citation Renumbering: Original source indices are normalized, deduplicated, and renumbered to a clean 1-N list matching `SOURCES`
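The citation renumbering feature can be pictured as a rewrite of the `[^N^]` markers in the answer text. The following is a minimal sketch under stated assumptions, not the project's synthesis code: `renumberCitations` and `Renumbered` are hypothetical names, marker `[^i^]` is assumed to refer to the i-th entry of the source list, and new numbers are assigned in order of first citation.

```typescript
// Hypothetical sketch of [^N^] renumbering; not searweb's actual implementation.
interface Renumbered {
  answer: string;    // answer text with markers rewritten to 1..N
  sources: string[]; // cited sources only, in order of first citation
}

function renumberCitations(answer: string, sources: string[]): Renumbered {
  const order: number[] = []; // original 1-based indices, in first-cited order
  const rewritten = answer.replace(/\[\^(\d+)\^\]/g, (marker: string, digits: string) => {
    const original = parseInt(digits, 10);
    // Leave markers that point outside the source list untouched.
    if (original < 1 || original > sources.length) return marker;
    let position = order.indexOf(original);
    if (position === -1) {
      order.push(original);
      position = order.length - 1;
    }
    return "[^" + (position + 1) + "^]";
  });
  return { answer: rewritten, sources: order.map((i) => sources[i - 1] as string) };
}

// Source 3 is cited first, so it becomes [^1^]; source 2 is never cited and is dropped.
const renumbered = renumberCitations(
  "TypeScript is typed [^3^]. It compiles to JavaScript [^1^][^3^].",
  ["https://a.example", "https://b.example", "https://c.example"],
);
// renumbered.answer  → "TypeScript is typed [^1^]. It compiles to JavaScript [^2^][^1^]."
// renumbered.sources → ["https://c.example", "https://a.example"]
```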
# Standard research (3-8 loops, 5+ tools)
searweb research "Latest advances in quantum computing"
# Quick research (1-3 loops, 2+ tools)
searweb research "What is Rust?" --level quick
# Deep research (6-15 loops, 10+ tools)
searweb research "Climate change mitigation strategies" --level deep
# Custom budget
searweb research "MCP protocol" --max-loops 5 --min-tools 3
# Continue a previous session
searweb research "Tell me more" --session abc12345
# List and manage sessions
searweb research --list
searweb research --rm abc12345 -y
# JSON output (no streaming, pipe-friendly)
searweb research "MCP protocol" --json
Research output (tree-style):
▶ Research: What is TypeScript?
├─ 🤔 thinking: The user wants to know what TypeScript is...
├─ [loop 1/3 | tools 2/2] ✅ min reached
├─ 🔍 search ddg "What is TypeScript" limit:10 → 10 results
└─ 🔍 search wiki "TypeScript" limit:5 → 5 results
├─ 🤔 thinking: I have initial results. Let me fetch key sources...
├─ [loop 2/3 | tools 4/2] ✅ min reached
├─ 📄 fetch www.typescriptlang.org → 4.9k chars
└─ 📄 fetch en.wikipedia.org/TypeScript → 10.0k chars
├─ [loop 3/3 | tools 6/2] ✅ min reached
├─ 📄 fetch builtin.com/typescript → 10.0k chars
└─ 📄 fetch www.w3schools.com/typescript_int... → 10.0k chars
────────────────────────────────────────────────────────────
ANSWER
────────────────────────────────────────────────────────────
**Executive Summary:** TypeScript is a high-level, statically typed superset
of JavaScript... [^1^][^2^]
## What TypeScript Is
- TypeScript is a **superset of JavaScript**... [^1^]
- It is **free and open-source**... [^2^]
...
└─ ✓ Done 3 loops, 6 tools, 3 sources
💾 Session saved: f0dda825 (use -s to continue)
────────────────────────────────────────────────────────────
SOURCES
────────────────────────────────────────────────────────────
1. https://en.wikipedia.org/wiki/TypeScript
2. https://www.typescriptlang.org/
3. https://builtin.com/software-engineering-perspectives/typescript
Interactive configuration wizard. Guides you through:
- Jina.ai API keys
- SearXNG Docker setup
- LLM provider configuration
- OpenCode integration
Start the MCP server explicitly.
searweb server
searweb server /path/to/config.json
Searweb follows a core + app architecture:
searweb/
├── src/
│ ├── core/ # Pure logic layer (no UI, no globals)
│ │ ├── search/ # DDG, SearXNG, Wikipedia
│ │ ├── fetch/ # Jina client, rule engine, caching
│ │ ├── research/ # LLM research with synthesis & citation renumbering
│ │ ├── rules/ # YAML-based site cleanup rules
│ │ ├── docker/ # SearXNG container management
│ │ └── index.ts # createCore(config, logger) factory
│ ├── app/
│ │ ├── mcp/ # MCP protocol wrapper
│ │ └── cli/ # Human CLI with formatting
│ └── index.ts # Unified entry point (routes MCP/CLI)
Key design decisions:
- Core layer is pure logic: no `console.log`, no global state, no UI assumptions
- `createCore(config, logger)` factory injects all dependencies
- App layers only handle presentation: MCP wraps results in JSON, CLI formats for terminal
- Research synthesis: Agent Loop gathers sources; a separate synthesis pass generates the final answer with renumbered citations
- Citation integrity: URLs are normalized (`decodeURIComponent`, strip hash, trim trailing slash), deduplicated, and renumbered into a contiguous 1-N list
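The citation-integrity steps named above (decode, strip hash, trim trailing slash, deduplicate, renumber) can be sketched as follows. This is an illustration under assumptions rather than the code in `src/core/research/`: `normalizeUrl`, `dedupeSources`, and `DedupedSources` are hypothetical names, and the steps are applied in the order they are listed.

```typescript
// Hypothetical sketch of URL normalization and deduplication; names are illustrative.
function normalizeUrl(raw: string): string {
  let url = raw.trim();
  try {
    url = decodeURIComponent(url);
  } catch (_err) {
    // Malformed percent-escapes: keep the URL as written.
  }
  const hash = url.indexOf("#");
  if (hash !== -1) url = url.slice(0, hash); // strip fragment
  while (url.length > 1 && url.charAt(url.length - 1) === "/") {
    url = url.slice(0, -1); // trim trailing slash(es)
  }
  return url;
}

interface DedupedSources {
  sources: string[];   // unique normalized URLs, in first-seen order
  numbering: number[]; // numbering[i] is the new 1-based number of the i-th input URL
}

function dedupeSources(urls: string[]): DedupedSources {
  const sources: string[] = [];
  const numbering: number[] = [];
  for (const raw of urls) {
    const key = normalizeUrl(raw);
    let position = sources.indexOf(key);
    if (position === -1) {
      sources.push(key);
      position = sources.length - 1;
    }
    numbering.push(position + 1);
  }
  return { sources, numbering };
}

// The first two URLs differ only by trailing slash and fragment, so they collapse into one.
const deduped = dedupeSources([
  "https://a.example/x/",
  "https://a.example/x#top",
  "https://b.example/%7Euser",
]);
// deduped.sources   → ["https://a.example/x", "https://b.example/~user"]
// deduped.numbering → [1, 1, 2]
```

A linear `indexOf` lookup keeps the sketch short; that is adequate for the handful of sources a research run typically collects.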
Site-specific cleanup rules are defined in YAML files under rules/.
Example (rules/github-file.yaml):
name: github-file
description: Clean up GitHub file pages
match:
  domains: [github.com]
  paths: [/{owner}/{repo}/blob/{branch}/{*path}]
sources:
  - name: github-raw
    type: redirect
    url: https://raw.githubusercontent.com/{owner}/{repo}/{branch}/{path}
    validate:
      minLength: 100
    on_error:
      action: continue
  - name: original
    type: original
process:
  - when: "source == 'github-html'"
    actions:
      - action: remove_section
        from: '## About'
        to: end
      - action: remove_consecutive_links
        threshold: 5
Actions: remove_until, remove_from, remove_section, remove_lines_matching, remove_consecutive_links, replace, mark.
See rules/ directory for more examples.
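As an illustration of how a rule's `paths` pattern and redirect `url` could fit together, the sketch below matches a request path against the template and fills the placeholders into the redirect URL. It is not the rule engine's code: `matchPath` and `fillTemplate` are hypothetical helpers, and the reading of `{*path}` as "capture the rest of the path, slashes included" is an assumption.

```typescript
// Hypothetical sketch of path-template matching; not the actual rule engine.
type Params = Record<string, string>;

function matchPath(template: string, path: string): Params | null {
  const tSegs = template.split("/").filter((s) => s.length > 0);
  const pSegs = path.split("/").filter((s) => s.length > 0);
  const params: Params = {};
  for (let i = 0; i < tSegs.length; i++) {
    const t = tSegs[i] as string;
    const wildcard = /^\{\*(\w+)\}$/.exec(t);
    if (wildcard) {
      // Assumed semantics: {*name} swallows all remaining segments (at least one).
      if (i >= pSegs.length) return null;
      params[wildcard[1] as string] = pSegs.slice(i).join("/");
      return params;
    }
    if (i >= pSegs.length) return null; // path is shorter than the template
    const named = /^\{(\w+)\}$/.exec(t);
    if (named) {
      params[named[1] as string] = pSegs[i] as string; // {name} captures one segment
    } else if (t !== pSegs[i]) {
      return null; // literal segment mismatch
    }
  }
  return tSegs.length === pSegs.length ? params : null;
}

function fillTemplate(urlTemplate: string, params: Params): string {
  // Unknown placeholders are left as written.
  return urlTemplate.replace(/\{(\w+)\}/g, (whole: string, name: string) =>
    Object.prototype.hasOwnProperty.call(params, name) ? (params[name] as string) : whole,
  );
}

const matched = matchPath(
  "/{owner}/{repo}/blob/{branch}/{*path}",
  "/modelcontextprotocol/specification/blob/main/docs/README.md",
);
const rawUrl = matched
  ? fillTemplate("https://raw.githubusercontent.com/{owner}/{repo}/{branch}/{path}", matched)
  : null;
// rawUrl → "https://raw.githubusercontent.com/modelcontextprotocol/specification/main/docs/README.md"
```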
npm install
npm run build
npm start # MCP server mode
npm run setup # Configuration wizard
npm test # Run test suite (vitest)
npm run test:coverage # Coverage report
Searweb uses Vitest for unit testing. All core logic is covered:
npm test # Run once
npm run test:watch # Watch mode
npm run test:coverage # With coverage report
Current test coverage: 53 tests covering config loading, session store (LRU eviction), prompt building, tool definitions, answer synthesis, and citation renumbering.
Verify MCP server is working:
# Test stdio mode (should output JSON-RPC initialize response)
echo '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"test","version":"1.0"}}}' | npx searweb
OpenCode specific:
opencode mcp list # List loaded MCP servers
opencode mcp debug searweb # Debug searweb MCP connection
opencode mcp auth searweb # Trigger OAuth (if configured)
Claude Desktop specific:
- Logs are in `~/Library/Logs/Claude/` (macOS) or `%APPDATA%\Claude\logs\` (Windows)
- Look for `mcp-server-searweb.log`
Common issues:
- `npx searweb` times out: Add `"timeout": 30000` to your MCP config
- SearXNG tool missing: Check `searweb xng --status` in CLI
- Research tool missing: Ensure `OPENAI_API_KEY` is set
- Empty research answer: Check LLM API key balance and model availability
MIT