Author: Philip Van de Walker | @trafflux | https://github.com/trafflux
A Model Context Protocol (MCP) server that provides GitHub Copilot and other MCP clients with access to RAGFlow's document retrieval capabilities.
Step 1: Build locally:

```bash
docker build -t ragflow-mcp-server:local .
```

Alternatively, add it from the Docker MCP Registry:

```bash
docker mcp server add ragflow-mcp-server
```

Then configure it via the Docker Desktop UI (Settings → MCP Toolkit → RAGFlow MCP Server).
This MCP server enables AI assistants like GitHub Copilot to search and retrieve relevant documents from RAGFlow datasets using natural language queries. It implements the MCP protocol for seamless integration with supported clients.
- Document Retrieval: Search across RAGFlow datasets using natural language queries
- Structured Results: Returns formatted JSON with document chunks, metadata, and pagination
- MCP Protocol Compliant: Full compatibility with MCP 2024-11-05 specification
- Production Ready: Robust error handling, logging, and async operations
- Cross-Platform: Works on Windows, Linux, macOS, and Docker environments
- Python 3.12+
- Access to a running RAGFlow instance
- RAGFlow API key
1. Install dependencies:

   ```bash
   uv sync
   ```

2. Configure environment variables:

   ```bash
   cp .env .env.local  # Edit .env.local with your RAGFlow API key and base URL
   ```
Set the following environment variables:

```bash
RAGFLOW_API_KEY=your-api-key-here
RAGFLOW_BASE_URL=http://localhost:9380
```

The server uses stdio transport for MCP compliance:

```bash
# Using environment variables
python3 -m mcp_app

# Or with explicit CLI options
python3 -m mcp_app --ragflow-api-key your-key --ragflow-base-url http://localhost:9380
```

For a containerized deployment that doesn't require a separate running server, add the following to your VS Code MCP configuration. This approach runs the MCP server as an ephemeral Docker container for each request:
```json
{
  "mcpServers": {
    "ragflow-mcp": {
      "command": "docker",
      "args": [
        "run",
        "-i",
        "--rm",
        "-e",
        "RAGFLOW_API_KEY=ragflow-xxxxxxxxx",
        "-e",
        "RAGFLOW_BASE_URL=http://host.docker.internal:9380",
        "--add-host=host.docker.internal:host-gateway",
        "--network",
        "devnet",
        "ragflow-mcp-server:local"
      ]
    }
  }
}
```

**Environment variables:**

- `RAGFLOW_API_KEY`: Your RAGFlow API key (replace `ragflow-xxxxxxxxx` with your actual key)
- `RAGFLOW_BASE_URL`: URL of your RAGFlow instance (default: `http://host.docker.internal:9380` for local Docker setups)

**Docker arguments:**

- `--rm`: Automatically remove the container when it exits
- `-i`: Keep STDIN open for interactive communication
- `--add-host=host.docker.internal:host-gateway`: Maps `host.docker.internal` to the Docker host (for accessing services on the host machine)
- `--network devnet`: Connects to the specified Docker network where RAGFlow is running
1. Build the Docker image:

   ```bash
   docker build -t ragflow-mcp-server:local .
   ```

2. Ensure RAGFlow is accessible: the container needs to reach your RAGFlow instance via the configured `RAGFLOW_BASE_URL`.

3. Network configuration: make sure the `--network` argument matches the Docker network where your RAGFlow instance is running.
For development or when Docker is not preferred, use the direct Python execution:
```json
{
  "mcpServers": {
    "ragflow": {
      "command": "python3",
      "args": ["-m", "mcp_app"],
      "cwd": "/path/to/ragflow-mcp-server",
      "env": {
        "RAGFLOW_API_KEY": "your-api-key",
        "RAGFLOW_BASE_URL": "http://localhost:9380"
      }
    }
  }
}
```

The server supports the standard MCP stdio transport protocol and can be configured with any MCP-compatible client.
Search RAGFlow datasets and retrieve relevant documents.
Parameters:

- `question` (string, required): The search query
- `dataset_ids` (array of strings, optional): Specific dataset IDs to search
- `document_ids` (array of strings, optional): Specific document IDs to filter
- `page` (integer, optional): Page number for pagination (default: 1)
- `page_size` (integer, optional): Results per page (default: 10)
- `similarity_threshold` (float, optional): Minimum similarity score (default: 0.2)
- `vector_similarity_weight` (float, optional): Weight for vector vs. keyword search (default: 0.3)
- `keyword` (boolean, optional): Enable keyword-based search (default: false)
- `top_k` (integer, optional): Maximum candidates to consider (default: 1024)
- `rerank_id` (string, optional): Reranking model identifier
- `force_refresh` (boolean, optional): Force refresh of cached metadata (default: false)
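For clarity, the documented defaults can be collected into a small client-side argument builder. This is a hypothetical helper (`build_arguments` is not part of the server), shown only to make the default values and the one required field concrete:

```python
# Defaults as documented for the ragflow_retrieval tool.
DEFAULTS = {
    "page": 1,
    "page_size": 10,
    "similarity_threshold": 0.2,
    "vector_similarity_weight": 0.3,
    "keyword": False,
    "top_k": 1024,
    "force_refresh": False,
}

def build_arguments(question: str, **overrides) -> dict:
    """Merge caller overrides onto the documented defaults.

    `question` is the only required parameter; everything else falls
    back to the defaults listed above.
    """
    if not question:
        raise ValueError("question is required")
    args = {"question": question, **DEFAULTS}
    args.update(overrides)
    return args

args = build_arguments("What is RAGFlow?", page_size=5)
print(args["page_size"])             # 5
print(args["similarity_threshold"])  # 0.2
```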
Response Format:

```json
{
  "chunks": [
    {
      "id": "chunk-id",
      "content": "Document content...",
      "content_ltks": "processed content...",
      "dataset_id": "dataset-id",
      "document_id": "document-id",
      "document_keyword": "filename.md",
      "highlight": "highlighted <em>content</em>...",
      "important_keywords": ["keyword1", "keyword2"],
      "positions": [[1, 2, 3]],
      "similarity": 0.85,
      "term_similarity": 0.8,
      "vector_similarity": 0.9
    }
  ],
  "pagination": {
    "page": 1,
    "page_size": 10,
    "total_chunks": 25,
    "total_pages": 3
  },
  "query_info": {
    "question": "your search query",
    "similarity_threshold": 0.2,
    "vector_weight": 0.3,
    "keyword_search": false,
    "dataset_count": 1
  }
}
```

The MCP server consists of:
- `mcp_app.py`: Main MCP server implementation with tool definitions and request handling
- `RAGFlowConnector`: Async HTTP client for communicating with the RAGFlow API
- MCP Protocol Layer: Handles MCP protocol messages and tool execution
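As a usage note for the `ragflow_retrieval` response shown above, a client can drive pagination and relevance filtering from the returned fields. The helpers below (`pages_remaining`, `best_chunks`) are hypothetical client-side code, not part of the server:

```python
import math

def pages_remaining(pagination: dict) -> int:
    """How many result pages are left after the current one."""
    total_pages = math.ceil(pagination["total_chunks"] / pagination["page_size"])
    return max(total_pages - pagination["page"], 0)

def best_chunks(response: dict, min_similarity: float = 0.5) -> list[str]:
    """Return chunk contents at or above a similarity cutoff, best first."""
    chunks = [c for c in response["chunks"] if c["similarity"] >= min_similarity]
    chunks.sort(key=lambda c: c["similarity"], reverse=True)
    return [c["content"] for c in chunks]

# Minimal response in the documented shape.
response = {
    "chunks": [
        {"content": "low-relevance text", "similarity": 0.3},
        {"content": "Document content...", "similarity": 0.85},
    ],
    "pagination": {"page": 1, "page_size": 10, "total_chunks": 25},
}
print(pages_remaining(response["pagination"]))  # 2
print(best_chunks(response))                   # ['Document content...']
```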
Important: This server uses lazy initialization of the aiohttp session within FastMCP's managed event loop. This is critical for compatibility:
- ❌ DON'T: Pre-initialize the aiohttp session before `FastMCP.run()` (it would be created in the wrong event loop)
- ✅ DO: Use `_ensure_connector_initialized()` to initialize on the first tool invocation (uses the correct event loop)
Why this matters: aiohttp's timeout context manager must be created in the same event loop where tools execute. Initializing in a different loop causes:
```
RuntimeError: Timeout context manager should be used inside a task
```
See TIMEOUT_FIX_FINAL.md for a detailed technical explanation.
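The lazy-initialization pattern can be sketched with plain `asyncio`. This is a simplified stand-in: the real connector wraps an `aiohttp.ClientSession`, but the loop-binding idea is the same:

```python
import asyncio

class Connector:
    """Stand-in for the aiohttp-backed RAGFlow connector."""
    def __init__(self):
        self.loop = None  # records which event loop created the session

    async def init(self):
        # In the real server this creates the aiohttp.ClientSession; the
        # session (and its timeout contexts) binds to the running loop.
        self.loop = asyncio.get_running_loop()

_connector = None

async def _ensure_connector_initialized():
    """Create the connector lazily, inside the loop the tool runs in."""
    global _connector
    if _connector is None:
        _connector = Connector()
        await _connector.init()
    return _connector

async def tool_handler():
    conn = await _ensure_connector_initialized()
    # Same loop as the tool: timeout contexts now work correctly.
    return conn.loop is asyncio.get_running_loop()

same_loop = asyncio.run(tool_handler())
print(same_loop)  # True
```

Pre-initializing `_connector` before `asyncio.run()` is exactly the failure mode described above: the session would be bound to a different loop than the one the tools execute in.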
- Max Connections: 100 total, 10 per host
- DNS TTL: 300 seconds
- Session Reuse: Connections pooled and reused across tool invocations
- Cleanup: Automatic on server shutdown
- Datasets Cache: LRU with 32 max items, 300s TTL
- Documents Cache: LRU with 128 max items, 300s TTL
- Cache Key: Dataset/document ID + parameters
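A cache with these properties (LRU eviction plus a per-entry TTL) can be sketched as follows. This is a simplified illustration, not the server's actual implementation:

```python
import time
from collections import OrderedDict

class TTLLRUCache:
    """LRU cache whose entries also expire after `ttl` seconds."""
    def __init__(self, maxsize=32, ttl=300.0):
        self.maxsize, self.ttl = maxsize, ttl
        self._data = OrderedDict()  # key -> (inserted_at, value)

    def get(self, key):
        item = self._data.get(key)
        if item is None:
            return None
        inserted_at, value = item
        if time.monotonic() - inserted_at > self.ttl:
            del self._data[key]        # expired: drop and miss
            return None
        self._data.move_to_end(key)    # mark as most recently used
        return value

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = (time.monotonic(), value)
        if len(self._data) > self.maxsize:
            self._data.popitem(last=False)  # evict least recently used

cache = TTLLRUCache(maxsize=2, ttl=300)
cache.put("ds1", {"name": "docs"})
cache.put("ds2", {"name": "wiki"})
cache.get("ds1")                    # touch ds1, so ds2 becomes LRU
cache.put("ds3", {"name": "faq"})   # exceeds maxsize: evicts ds2
print(cache.get("ds2"))  # None
print(cache.get("ds1"))  # {'name': 'docs'}
```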
```
ragflow-mcp-server/
├── mcp_app.py            # Main MCP server (FastMCP)
├── pyproject.toml        # Python project configuration
├── uv.lock               # Dependency lock file
├── Dockerfile            # Docker container configuration
├── docker-entrypoint.sh  # Docker entrypoint script
├── tools.json            # MCP tool definitions
├── .env                  # Environment variables (example)
├── README.md             # This documentation
└── IMPLEMENTATION.md     # Technical implementation details
```
Run the MCP server and use MCP client tools to test:

```bash
# Test with MCP inspector
npx @modelcontextprotocol/inspector python3 -m mcp_app

# Or test directly with environment variables set
RAGFLOW_API_KEY=your-key RAGFLOW_BASE_URL=http://localhost:9380 \
  npx @modelcontextprotocol/inspector python3 -m mcp_app
```

Build the Docker image for local use or deployment:

```bash
# Build with local tag
docker build -t ragflow-mcp-server:local .

# Or build with a specific version tag
docker build -t ragflow-mcp-server:v1.0.0 .
```

When running the container, configure these environment variables:
```bash
# Required: RAGFlow API credentials
RAGFLOW_API_KEY=your-ragflow-api-key-here
RAGFLOW_BASE_URL=http://host.docker.internal:9380

# Optional: Logging and debugging
LOG_LEVEL=INFO  # DEBUG, INFO, WARNING, ERROR
```

For testing or development:

```bash
docker run -i --rm \
  -e RAGFLOW_API_KEY=ragflow-xxxxxxxxx \
  -e RAGFLOW_BASE_URL=http://host.docker.internal:9380 \
  --add-host=host.docker.internal:host-gateway \
  --network devnet \
  ragflow-mcp-server:local
```

For production use, consider:
```bash
# Run with resource limits
docker run -d \
  --name ragflow-mcp-server \
  --restart unless-stopped \
  --memory 512m \
  --cpus 0.5 \
  -e RAGFLOW_API_KEY=ragflow-xxxxxxxxx \
  -e RAGFLOW_BASE_URL=http://your-ragflow-host:9380 \
  ragflow-mcp-server:local
```

Add to your docker-compose.yml:
```yaml
services:
  ragflow-mcp:
    build: .
    environment:
      - RAGFLOW_API_KEY=ragflow-xxxxxxxxx
      - RAGFLOW_BASE_URL=http://ragflow:9380
    networks:
      - ragflow-network
    depends_on:
      - ragflow
```

Common issues:

- "Tool not found": Ensure the MCP server is running and properly configured in your client
- "Connection failed": Check the RAGFlow API key and base URL in your environment variables
- "No results": Verify that datasets exist and contain searchable content
- "Timeout": Increase timeout values or check RAGFlow server performance
The server provides detailed logging. Check your MCP client's logs for error messages, and set `LOG_LEVEL=DEBUG` for verbose output.
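Assuming `LOG_LEVEL` maps onto Python's standard `logging` level names, the wiring might look like this. This is a sketch, not the server's actual code:

```python
import logging
import os

def configure_logging() -> logging.Logger:
    """Read LOG_LEVEL from the environment and configure the root logger."""
    level_name = os.environ.get("LOG_LEVEL", "INFO").upper()
    # Unknown names fall back to INFO rather than crashing at startup.
    level = getattr(logging, level_name, logging.INFO)
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(name)s %(levelname)s %(message)s",
        force=True,  # replace any pre-existing handlers
    )
    return logging.getLogger("ragflow-mcp")

os.environ["LOG_LEVEL"] = "DEBUG"
log = configure_logging()
print(log.getEffectiveLevel() == logging.DEBUG)  # True
```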
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
This server is designed to be listed in the Official Docker MCP Registry. To contribute this server to the registry:
- Review the Docker MCP Registry Contributing Guide
- This repository includes a compliant `server.yaml` file that can be submitted
- The `tools.json` file provides tool metadata for the registry
- Submit a pull request to the Docker MCP Registry with the server metadata
For more details, see the Docker MCP Toolkit Documentation.
This project is part of the RAGFlow ecosystem. Licensed under Apache License 2.0.

(Architecture note, recovered from a diagram fragment: the server talks to the RAGFlow backend REST API on port 9380.)
## Installation
### Option 1: Docker Container (Recommended)
Build and run the MCP server in a Docker container:
```bash
# Build the image
docker build -t ragflow-mcp-server:local .
# Run with environment variables
docker run -i --rm \
-e RAGFLOW_API_KEY=your-api-key \
-e RAGFLOW_BASE_URL=http://host.docker.internal:9380 \
--add-host=host.docker.internal:host-gateway \
--network devnet \
ragflow-mcp-server:local
```

### Option 2: Direct Python Execution

Install and run directly with Python:

```bash
# Install dependencies
uv sync

# Set environment variables
export RAGFLOW_API_KEY=your-api-key
export RAGFLOW_BASE_URL=http://localhost:9380

# Run the server
python3 -m mcp_app
```

Configure these environment variables:

```bash
# Required: RAGFlow API credentials
RAGFLOW_API_KEY=your-ragflow-api-key-here
RAGFLOW_BASE_URL=http://localhost:9380

# Optional: Logging level
LOG_LEVEL=INFO  # DEBUG, INFO, WARNING, ERROR
```

For containerized deployment, add to your docker-compose.yml:
```yaml
services:
  ragflow-mcp:
    build: .
    environment:
      - RAGFLOW_API_KEY=your-api-key
      - RAGFLOW_BASE_URL=http://ragflow:9380
    networks:
      - ragflow-network
    depends_on:
      - ragflow
```

Update your VS Code MCP settings to include the RAGFlow server:
For Docker-based setup:
```json
{
  "mcpServers": {
    "ragflow-mcp": {
      "command": "docker",
      "args": [
        "run",
        "-i",
        "--rm",
        "-e",
        "RAGFLOW_API_KEY=your-api-key",
        "-e",
        "RAGFLOW_BASE_URL=http://host.docker.internal:9380",
        "--add-host=host.docker.internal:host-gateway",
        "--network",
        "devnet",
        "ragflow-mcp-server:local"
      ]
    }
  }
}
```

For direct Python setup:
```json
{
  "mcpServers": {
    "ragflow": {
      "command": "python3",
      "args": ["-m", "mcp_app"],
      "cwd": "/path/to/ragflow-mcp-server",
      "env": {
        "RAGFLOW_API_KEY": "your-api-key",
        "RAGFLOW_BASE_URL": "http://localhost:9380"
      }
    }
  }
}
```

Important: After updating the config:
- Save the file
- Close VS Code completely
- Reopen VS Code
- Copilot should now show RAGFlow tools available
Test the server using the official MCP inspector:
```bash
# Install MCP inspector if not already installed
npm install -g @modelcontextprotocol/inspector

# Test with environment variables
RAGFLOW_API_KEY=your-key RAGFLOW_BASE_URL=http://localhost:9380 \
  mcp-inspector python3 -m mcp_app
```

You can also test the server manually by piping JSON-RPC messages:
```bash
# Initialize the server
echo '{"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {"protocolVersion": "2024-11-05", "capabilities": {}, "clientInfo": {"name": "test", "version": "1.0"}}}' | \
  RAGFLOW_API_KEY=your-key RAGFLOW_BASE_URL=http://localhost:9380 \
  python3 -m mcp_app

# List available tools
echo '{"jsonrpc": "2.0", "id": 2, "method": "tools/list", "params": {}}' | \
  RAGFLOW_API_KEY=your-key RAGFLOW_BASE_URL=http://localhost:9380 \
  python3 -m mcp_app
```

For integration testing with actual RAGFlow data:
```bash
# Set your actual environment variables
export RAGFLOW_API_KEY="your-actual-api-key"
export RAGFLOW_BASE_URL="http://your-ragflow-host:9380"

# Test a retrieval query
echo '{
  "jsonrpc": "2.0",
  "id": 3,
  "method": "tools/call",
  "params": {
    "name": "ragflow_retrieval",
    "arguments": {
      "question": "What is RAGFlow?",
      "page_size": 5
    }
  }
}' | python3 -m mcp_app
```

When the server runs in HTTP mode, the `/mcp/` endpoint can also be exercised from Python:

```python
response = requests.post(
    "http://localhost:9382/mcp/",
    headers=headers,
    json=request,
)
print(response.status_code)  # 202
print(response.text)         # SSE stream with tool list
```
## Available Tools
### ragflow_retrieval
Search RAGFlow datasets and retrieve relevant documents.
**Parameters:**
- `question` (string, required): The search query
- `dataset_ids` (array, optional): Specific datasets to search
- `document_ids` (array, optional): Specific documents to search within
- `page` (integer, default: 1): Results page number
- `page_size` (integer, default: 10, max: 100): Results per page
- `similarity_threshold` (number, default: 0.2): Minimum match similarity
- `vector_similarity_weight` (number, default: 0.3): Vector vs keyword weight
- `keyword` (boolean, default: false): Enable keyword search
- `top_k` (integer, default: 1024): Candidate pool size
- `rerank_id` (string, optional): Reranking model
- `force_refresh` (boolean, default: false): Skip cache
**Example Response:**
```
Search Results:

1. RAGFlow is a sophisticated document analysis platform...
   Source: documentation/overview.md | Relevance: 0.95

2. It supports various document formats including PDF...
   Source: documentation/features.md | Relevance: 0.87
```
## Troubleshooting
### Copilot can't find the server
1. **Verify server is running:**
   ```bash
   curl http://localhost:9382/health
   ```

2. **Check firewall:** Windows Firewall might block port 9382. Allow it in Windows Defender.

3. **Check configuration:**
   - Verify `mcp.json` has the correct URL with a trailing slash: `/mcp/`
   - Verify the Bearer token matches exactly
   - Close and reopen VS Code after changes

4. **Check logs:**

   ```bash
   docker logs ragflow-server | grep -i mcp
   ```
- Ensure the RAGFlow backend is running and accessible
- Check the `MCP_BASE_URL` setting
- Verify the API key is valid
If getting JSON directly instead of SSE stream:
- Ensure `json_response=False` in the server code
- The container might be using an old image; rebuild:

  ```bash
  docker-compose build --no-cache ragflow
  ```
```yaml
environment:
  - MCP_MODE=http
  - MCP_HOST=0.0.0.0  # Don't restrict to localhost
  - MCP_PORT=9382
```

For production deployments:

- Run behind a reverse proxy (nginx)
- Use proper SSL/TLS certificates
- Set up monitoring and alerting
- Configure resource limits in Docker
Edit `mcp_app.py` and add to the `list_tools()` handler:

```python
@self.mcp_server.list_tools()
async def list_tools() -> list[Tool]:
    return [
        # ... existing tools
        Tool(
            name="my_tool",
            description="Tool description",
            inputSchema={ ... },
        ),
    ]
```

Then handle it in `call_tool()`:
```python
@self.mcp_server.call_tool()
async def call_tool(name: str, arguments: dict) -> list[TextContent]:
    if name == "my_tool":
        # Implement tool logic
        pass
```

```bash
# Install in development mode
uv sync

# Run tests (if available)
python -m pytest

# Run the server
python3 -m mcp_app
```

The new server is fully backward compatible but requires:
- Update `docker-compose.yml` (the `command` section was removed)
- Ensure environment variables are set
- Restart containers: `docker-compose restart`
- Reload VS Code
- MCP Specification: https://spec.modelcontextprotocol.io/
- RAGFlow: https://github.com/infiniflow/ragflow
- GitHub Issues: [Report issues here]
Same as RAGFlow (Apache 2.0)
Version: 2.0.0
Protocol: MCP 2024-11-05
Last Updated: October 2025