Production-grade search engine with Learning-to-Rank, neural re-ranking, hybrid retrieval (BM25 + ANN), and real-time query understanding.
graph TD
Q[User Query] --> QU[Query Understanding]
QU -->|Intent + Entities| BM25[BM25 Retriever]
QU -->|Expanded Query| ANN[ANN Retriever]
BM25 --> Hybrid[Hybrid Fusion RRF]
ANN --> Hybrid
Hybrid --> FE[Feature Extractor]
FE --> LTR[Learning-to-Rank]
LTR --> RR[Reranker Cross-Encoder]
RR --> Results[Ranked Results]
Cache[Search Cache] -.->|TTL Lookup| Q
Q -.->|Cache Miss| QU
style Q fill:#4a90d9,color:#fff
style Results fill:#27ae60,color:#fff
style Cache fill:#f39c12,color:#fff
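The dashed edges in the diagram describe a TTL cache lookup that runs before the pipeline. A minimal in-memory sketch of that flow follows. The `TTLCache` class, the `cached_search` helper, and the `run_pipeline` callable (standing in for every stage from Query Understanding to the reranker) are hypothetical names for illustration, not this repository's API.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple


class TTLCache:
    """Hypothetical in-memory cache; entries expire ttl_seconds after being stored."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl_seconds = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._store: Dict[str, Tuple[float, List[int]]] = {}

    def get(self, key: str) -> Optional[List[int]]:
        entry = self._store.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at >= self.ttl_seconds:
            del self._store[key]  # expired: drop the entry and report a miss
            return None
        return value

    def put(self, key: str, value: List[int]) -> None:
        self._store[key] = (self._clock(), value)


def cached_search(query: str, cache: TTLCache,
                  run_pipeline: Callable[[str], List[int]]) -> List[int]:
    """Return cached doc IDs on a hit; otherwise run the pipeline and cache the result."""
    key = query.strip().lower()  # assumption: simple normalisation as the cache key
    hit = cache.get(key)
    if hit is not None:
        return hit
    results = run_pipeline(query)  # cache miss: fall through to the full pipeline
    cache.put(key, results)
    return results
```

Passing the clock as a parameter keeps the expiry logic deterministic under test; a Redis backend would replace `get`/`put` with calls that set a key expiry instead.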
| Stage | Technology | Purpose |
|---|---|---|
| Query Understanding | Intent classifier + Entity extractor + Query expansion | Infer user intent, extract entities, expand the query |
| BM25 Retrieval | Custom inverted index (Okapi BM25) | Keyword-based matching |
| ANN Retrieval | FAISS IVF-PQ + Sentence-BERT | Semantic vector search |
| Hybrid Fusion | RRF (Reciprocal Rank Fusion) | Combine BM25 + ANN rankings |
| Feature Extraction | 10+ ranking signals | Prepare features for LTR |
| Learning-to-Rank | LightGBM LambdaRank | Listwise (NDCG-driven) ranking optimization |
| Neural Reranking | Cross-encoder (MS MARCO MiniLM) | Fine-grained relevance scoring |
| Caching | Redis / In-memory (TTL-based) | Low-latency repeated queries |
| Monitoring | Prometheus metrics + Health checks | Production observability |
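To illustrate the Hybrid Fusion stage, here is a small sketch of Reciprocal Rank Fusion. Each retriever contributes `weight / (k + rank)` for every document it returns, so only rank positions matter and the raw BM25 and ANN scores never need to be put on a common scale. The function name `rrf_fuse`, the optional per-retriever weights, and the constant `k = 60` (a commonly used default) are assumptions for this example and may differ from what the repository does.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Sequence, Tuple


def rrf_fuse(rankings: Sequence[Sequence[int]],
             weights: Optional[Sequence[float]] = None,
             k: int = 60) -> List[Tuple[int, float]]:
    """Fuse ranked lists of doc IDs (best first) with weighted Reciprocal Rank Fusion."""
    if weights is None:
        weights = [1.0] * len(rankings)
    if len(weights) != len(rankings):
        raise ValueError("need exactly one weight per ranking")

    scores: Dict[int, float] = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):  # ranks are 1-based
            scores[doc_id] += weight / (k + rank)

    # Highest fused score first; ties broken by doc ID for deterministic output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


bm25_ranking = [1, 2, 3]  # hypothetical doc IDs from the keyword retriever
ann_ranking = [3, 1, 4]   # hypothetical doc IDs from the vector retriever
fused = rrf_fuse([bm25_ranking, ann_ranking])
```

Documents 1 and 3 appear in both lists and therefore rise above documents 2 and 4, which each appear in only one.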
# Install dependencies
pip install -r requirements.txt
# Start the API
uvicorn serving.search_api:app --host 0.0.0.0 --port 8000 --reload
# Index a single document
curl -X POST http://localhost:8000/index \
  -H "Content-Type: application/json" \
  -d '{"doc_id": 1, "title": "Python tutorials", "body": "Learn Python programming"}'
# Search
curl "http://localhost:8000/search?q=python+tutorial&top_k=5"
# Index a batch of documents
curl -X POST http://localhost:8000/index/batch \
  -H "Content-Type: application/json" \
  -d '[{"doc_id": 1, "title": "Doc 1", "body": "Content 1"}, {"doc_id": 2, "title": "Doc 2", "body": "Content 2"}]'
# Train the LTR model
python scripts/train_ltr.py
Produces models/ltr_model.txt (LightGBM LambdaRank).
| Metric | Description |
|---|---|
| NDCG@1/3/5/10 | Normalized Discounted Cumulative Gain |
| MAP@5/10 | Mean Average Precision |
| Recall@10 | Recall at top 10 |
| MRR@10 | Mean Reciprocal Rank |
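For reference, two of the metrics in the table can be computed as sketched below. This is an illustrative version, not the code in `evaluation/`. It assumes the exponential gain `2^rel - 1`, and it builds the ideal ordering from the retrieved list only; a full evaluation would use every judged document for the query.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the top-k graded relevance labels."""
    return sum((2 ** rel - 1) / math.log2(pos + 2)
               for pos, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances: Sequence[float], k: int) -> float:
    """DCG normalised by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


def mrr_at_k(per_query_relevances: Sequence[Sequence[float]], k: int) -> float:
    """Mean over queries of 1/rank of the first relevant result within the top k."""
    if not per_query_relevances:
        return 0.0
    total = 0.0
    for relevances in per_query_relevances:
        for rank, rel in enumerate(relevances[:k], start=1):
            if rel > 0:
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(per_query_relevances)
```

Each input list holds the relevance labels of the returned documents in ranked order, so `ndcg_at_k([3, 2, 1], 3)` describes a perfectly ordered result list.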
kubectl create namespace search
kubectl apply -f k8s/
Includes: Deployment (3 replicas), HPA (auto-scale to 20), ConfigMap, Ingress (TLS), PVC (10Gi models + 50Gi index).
docker build -t search-ranking .
docker run -p 8000:8000 search-ranking
config/search_config.yaml controls all pipeline stages:
- Retrieval weights (BM25 k1/b, ANN ef_search, hybrid RRF weights)
- LTR hyperparameters (num_leaves, learning_rate, n_estimators)
- Reranker model selection
- Caching TTL and backend
- Evaluation metrics
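Of the settings listed above, BM25's `k1` and `b` are the least self-explanatory: `k1` controls how quickly repeated occurrences of a term stop adding score, and `b` controls how strongly long documents are penalised. The sketch below scores pre-tokenised documents with the Okapi BM25 formula to show both effects. The function name and the defaults `k1=1.2`, `b=0.75` are common textbook choices, not values read from `config/search_config.yaml`, and it scans every document rather than using an inverted index.

```python
import math
from collections import Counter
from typing import List, Sequence


def bm25_scores(query_terms: Sequence[str], docs: Sequence[Sequence[str]],
                k1: float = 1.2, b: float = 0.75) -> List[float]:
    """Score each tokenised document against the query with Okapi BM25."""
    n_docs = len(docs)
    if n_docs == 0:
        return []
    avgdl = sum(len(doc) for doc in docs) / n_docs

    # Document frequency: number of documents containing each term.
    df: Counter = Counter()
    for doc in docs:
        df.update(set(doc))

    scores = []
    for doc in docs:
        tf = Counter(doc)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalisation: b=0 disables it, b=1 applies it fully.
            length_norm = 1 - b + b * len(doc) / avgdl
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
        scores.append(score)
    return scores


docs = [
    ["python", "tutorial"],
    ["python", "python", "cooking", "recipes", "for", "beginners"],
    ["java", "guide"],
]
default_scores = bm25_scores(["python"], docs)            # short doc ranks first
no_length_norm = bm25_scores(["python"], docs, b=0.0)     # higher tf ranks first
```

With the default `b`, the short two-token document outranks the longer one despite fewer matches; with `b=0.0` the ordering flips because only term frequency matters.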
search_ranking/
├── indexing/      # Inverted index + FAISS vector index
├── retrieval/     # BM25, ANN, Hybrid retrievers
├── query/         # Intent classifier, entity extraction, expansion
├── ranking/       # Feature extraction, LTR, neural reranker
├── serving/       # FastAPI app + search pipeline orchestrator
├── evaluation/    # NDCG, MAP, MRR, Recall metrics
├── caching/       # Redis/in-memory result cache
├── k8s/           # Kubernetes manifests
├── scripts/       # LTR training pipeline
└── config/        # YAML configuration