An end-to-end, event-driven architecture simulating a real-world delivery tracking system (like Swiggy, Zomato, or UberEats). Built with Node.js, Apache Kafka, Python, WebSockets, MongoDB, and Leaflet.js.
[ Rider Simulator (Python) ]
│
├─ Fetch Real-world Route (OSRM API)
├─ Step through coordinates
↓
[ Kafka Producer ]
│
├─ Topic: rider-location
↓
[ Apache Kafka Broker (KRaft) ]
│
↓
[ Kafka Consumer (Node.js) ]
│
├─ Save data to MongoDB (Mongoose)
├─ Broadcast updates via WebSockets
↓
[ Frontend Dashboard (HTML/JS + Leaflet) ]
│
└─ Listen to WebSockets & animate 🛵 markers live on the map
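To make the flow above concrete, here is a minimal sketch of what one message on the rider-location topic could look like, and how it might be serialized by the producer and parsed by a consumer. The field names (rider_id, lat, lng, timestamp_ms) and the RiderLocation class are assumptions made for illustration; the real payload is whatever simulator.py and server.js agree on.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class RiderLocation:
    """Hypothetical rider-location payload; field names are assumptions."""
    rider_id: str
    lat: float
    lng: float
    timestamp_ms: int


def encode(event: RiderLocation) -> bytes:
    """Serialize an event to compact UTF-8 JSON, ready to be a message value."""
    return json.dumps(asdict(event), separators=(",", ":")).encode("utf-8")


def decode(raw: bytes) -> RiderLocation:
    """Parse a message value back into an event.

    Raises TypeError if the JSON object has missing or unexpected keys.
    """
    return RiderLocation(**json.loads(raw.decode("utf-8")))
```

Keeping encode and decode as a matched pair makes it easy to check that a producer and a consumer agree on the schema before Kafka is involved at all.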
- Highly Scalable Event Streaming (Apache Kafka): Ingests thousands of GPS coordinates asynchronously across the rider-location, rider-predictions, traffic-density, and rider-alerts topics.
- PySpark Structured Streaming: Replaces slow database queries by processing telemetry natively in the streaming layer over a 10-second sliding window.
- Machine Learning at the Edge: Scikit-Learn models (RandomForestRegressor and IsolationForest) are embedded directly into PySpark UDFs to calculate real-time AI ETAs and detect "Ghost Rider" GPS fraud instantly.
- Spatial Heatmaps (Uber H3 Indexing): Replaces scattered UI dots with dynamic, color-coded hexagonal aggregations (Green/Orange/Red) representing clustered traffic density.
- Dynamic Geofencing: Real-time bounded polygon checks alerting when riders enter restricted zones (e.g., HITEC City).
- Real-Time WebSockets: Frontend subscribes via Socket.io to receive live updates with sub-second latency from the Node.js consumer.
- Custom UI Integration: Maps powered by Leaflet and OpenStreetMap native tiles, customized with SVG drop-shadow rider tokens that pulse red on fraud detection.
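The geofencing bullet above comes down to a point-in-polygon test. The sketch below uses the standard ray-casting method on plain (lat, lng) pairs, which is a reasonable planar approximation for a zone a few kilometres across. The function name and the RESTRICTED_ZONE rectangle are hypothetical; the coordinates are only roughly in the HITEC City area and are not the project's real zone.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (lat, lng)


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray casting: toggle 'inside' each time a ray from the point crosses an edge."""
    lat, lng = point
    inside = False
    n = len(polygon)
    for i in range(n):
        lat1, lng1 = polygon[i]
        lat2, lng2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (lat1 > lat) != (lat2 > lat):
            # Longitude at which this edge reaches the point's latitude.
            cross_lng = lng1 + (lat - lat1) * (lng2 - lng1) / (lat2 - lat1)
            if lng < cross_lng:
                inside = not inside
    return inside


# Hypothetical rectangular zone, for illustration only.
RESTRICTED_ZONE: List[Point] = [
    (17.44, 78.37),
    (17.44, 78.39),
    (17.46, 78.39),
    (17.46, 78.37),
]
```

An alert would then be raised whenever point_in_polygon((lat, lng), RESTRICTED_ZONE) flips from False to True for a rider.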
delivery-tracking/
│
├── backend/
│ ├── package.json # Express, KafkaJS, Mongoose, Socket.io
│ └── server.js # Consumer + WebSocket Server + MongoDB schemas
│
├── producer/
│ ├── requirements.txt
│ ├── simulator.py # Advanced OSRM route coordinate simulator
│ └── benchmark.py # Kafka throughput load-stress generator (10k+ riders)
│
├── spark/
│ ├── train_models.py # Scikit-Learn Offline Training (RandomForest/IsolationForest)
│ ├── stream_processor.py # PySpark Streaming Engine (H3, ML Inference, Windowing)
│ └── models/ # Pre-trained .joblib intelligence files
│
└── frontend/
├── index.html # Leaflet.js Interactive frontend mapping & AI Status
└── mobile-tracker.html # HTML5 Geolocation API interface for real 5G tracking
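The windowing and density logic listed for spark/stream_processor.py can be pictured with a small plain-Python stand-in. This is not PySpark and not H3: it keeps a trailing 10-second window of pings, counts them per square grid cell, and maps each count to a color. The cell_for helper, the DensityWindow class, and both color thresholds are assumptions for illustration, and the sketch expects pings to arrive in timestamp order.

```python
import math
from collections import Counter, deque
from typing import Deque, Dict, Tuple

WINDOW_SECONDS = 10  # window length mentioned in the feature list

Cell = Tuple[int, int]


def cell_for(lat: float, lng: float, size_deg: float = 0.01) -> Cell:
    """Stand-in for an H3 index: snap a coordinate to a square grid cell."""
    return (math.floor(lat / size_deg), math.floor(lng / size_deg))


class DensityWindow:
    """Counts pings per cell over the trailing WINDOW_SECONDS."""

    def __init__(self) -> None:
        self._events: Deque[Tuple[float, Cell]] = deque()
        self._counts: Counter = Counter()

    def add(self, ts: float, lat: float, lng: float) -> None:
        """Record a ping at time ts (seconds), then drop expired pings."""
        cell = cell_for(lat, lng)
        self._events.append((ts, cell))
        self._counts[cell] += 1
        self._evict(ts)

    def _evict(self, now: float) -> None:
        # Remove pings that are WINDOW_SECONDS old or older.
        while self._events and self._events[0][0] <= now - WINDOW_SECONDS:
            _, old_cell = self._events.popleft()
            self._counts[old_cell] -= 1
            if self._counts[old_cell] == 0:
                del self._counts[old_cell]

    def counts(self) -> Dict[Cell, int]:
        return dict(self._counts)


def color(count: int, orange_at: int = 5, red_at: int = 15) -> str:
    """Map a cell count to a heatmap color; both thresholds are invented."""
    if count >= red_at:
        return "red"
    if count >= orange_at:
        return "orange"
    return "green"
```

In the real pipeline the same idea would be expressed with Spark's windowed aggregation over the Kafka stream, with H3 cell IDs as the grouping key.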
Make sure you have installed:
- Java Development Kit (JDK 11+) - Required by Kafka.
- Apache Kafka - Downloaded and configured locally in C:\kafka.
- Node.js (v16+) - For the backend.
- Python (3.x) - For the simulator.
- MongoDB - Running locally on port 27017 or configured via Atlas.
Open a Git Bash terminal. Format storage (first time only) and start the server:
cd /c/kafka
# Format storage on the first run only. Generate a <uuid> with: ./bin/windows/kafka-storage.bat random-uuid
# ./bin/windows/kafka-storage.bat format -t <uuid> -c ./config/server.properties
./bin/windows/kafka-server-start.bat ./config/server.properties
Open a new Git Bash terminal while Kafka is running. Create the required topic(s):
cd /c/kafka
# Core topic used by producer and backend
./bin/windows/kafka-topics.bat --create --topic rider-location --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1
# Optional research topics for upcoming analytics features
./bin/windows/kafka-topics.bat --create --topic traffic-density --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1
./bin/windows/kafka-topics.bat --create --topic rider-alerts --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1
./bin/windows/kafka-topics.bat --create --topic rider-predictions --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1
# Verify topics
./bin/windows/kafka-topics.bat --list --bootstrap-server localhost:9092
If a topic already exists, Kafka will report it. You can safely continue.
Open a Git Bash terminal. Train the Machine Learning models using the historical data synthesizer.
cd delivery-tracking/spark
source .venv/Scripts/activate
python -m ensurepip --default-pip
python -m pip install -r requirements.txt
python train_models.py
(You will see .joblib files generated inside the spark/models/ folder.)
In the same Spark terminal, start the stream processing engine that applies the ML models and Spatial H3 logic natively over Kafka.
python stream_processor.py
Open a new Git Bash terminal. This hooks into MongoDB and listens to the Kafka ML prediction/density/alert topics.
cd delivery-tracking/backend
npm install
node server.js
Open another Git Bash terminal. This processes OSRM routes and begins producing data into Kafka every 2 seconds.
cd delivery-tracking/producer
python -m pip install -r requirements.txt
python simulator.py
(Alternatively, run python benchmark.py to stress-test your system with 10,000 riders.)
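As a rough picture of what the simulator step does, the sketch below extracts coordinates from an OSRM route response and emits one event per point at a fixed interval. It assumes the route was requested with geometries=geojson, in which case OSRM returns [lng, lat] pairs that need flipping. The send callable stands in for a Kafka producer, and the event field names are assumptions; simulator.py may structure this differently.

```python
import time
from typing import Callable, Dict, List, Tuple

LatLng = Tuple[float, float]


def route_points(osrm_response: Dict) -> List[LatLng]:
    """Return the first route's points as (lat, lng).

    Assumes a response requested with geometries=geojson, where each
    coordinate is stored as [lng, lat].
    """
    coords = osrm_response["routes"][0]["geometry"]["coordinates"]
    return [(lat, lng) for lng, lat in coords]


def simulate(
    rider_id: str,
    points: List[LatLng],
    send: Callable[[Dict], None],
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Send one event per route point, pausing interval_s between points.

    `send` is a placeholder for the producer call; `sleep` is injectable so
    the loop can be exercised without waiting. Returns the number sent.
    """
    for seq, (lat, lng) in enumerate(points):
        send({"rider_id": rider_id, "lat": lat, "lng": lng, "seq": seq})
        if seq < len(points) - 1:
            sleep(interval_s)
    return len(points)
```

Passing print as send and a no-op as sleep is a quick way to dry-run a route before pointing the loop at a real producer.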
Navigate to the delivery-tracking/frontend/ folder in your Windows File Explorer and double-click index.html to open it in your browser.
You will immediately see 🛵 scooters tracking live along the streets, predicting ETAs, glowing red on anomalies, and projecting traffic heatmaps, all driven by WebSocket events!
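For intuition about what an anomaly can look like, here is a deliberately simple check that is not the IsolationForest model the project trains: it flags a rider whose two consecutive pings imply an implausible speed. The function names and the 120 km/h threshold are assumptions chosen for illustration.

```python
import math
from typing import Tuple

EARTH_RADIUS_KM = 6371.0

Ping = Tuple[float, float, float]  # (lat, lng, unix_seconds)


def haversine_km(lat1: float, lng1: float, lat2: float, lng2: float) -> float:
    """Great-circle distance between two coordinates, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def implied_speed_kmh(prev: Ping, curr: Ping) -> float:
    """Speed needed to travel from prev to curr in the elapsed time."""
    dt_s = curr[2] - prev[2]
    if dt_s <= 0:
        # Out-of-order or duplicate timestamps are treated as suspicious.
        return float("inf")
    distance = haversine_km(prev[0], prev[1], curr[0], curr[1])
    return distance / (dt_s / 3600.0)


def looks_like_ghost_rider(prev: Ping, curr: Ping, max_kmh: float = 120.0) -> bool:
    """True when the jump between pings exceeds the (invented) speed limit."""
    return implied_speed_kmh(prev, curr) > max_kmh
```

A learned model can pick up subtler patterns than a single threshold, but a rule like this is a useful baseline to compare it against.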
To track a real mobile phone on 4G/5G and view the live dashboard from anywhere, we use a hybrid deployment architecture:
- Frontend: Hosted on Vercel (Static HTML/JS).
- Backend / Kafka: Hosted locally on your machine, tunneled securely to the internet via Ngrok.
- Download and authenticate Ngrok on your backend machine.
- In a terminal, expose your Node.js Backend port (default 3001):
  ngrok http 3001
- Copy the output Forwarding URL (e.g., https://vannesa-unflaked-zoraida.ngrok-free.dev).
- Open frontend/config.js.
- Update the DEPLOYED_URL variable to your new Ngrok URL:
  const DEPLOYED_URL = "https://<your-ngrok-id>.ngrok-free.dev";
  const BACKEND_URL = DEPLOYED_URL;
- Commit and push your changes to GitHub.
- Go to Vercel and import your GitHub repository.
- Set the "Framework Preset" to Other and the "Root Directory" to frontend.
- Click Deploy. Your frontend is now available globally!
(Note: Ngrok's free tier shows an interstitial browser warning screen when visiting a URL. The frontend architecture automatically injects the HTTP Header "ngrok-skip-browser-warning": "69420" into all Socket.io & Fetch requests to silently bypass this block without needing a paid Ngrok account).
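If you want to script a quick check against the tunnel from outside the browser, the same header can be attached by hand. The sketch below only builds the request with Python's standard library; the tunnel_request helper is hypothetical, and the header name and value are the ones quoted in the note above.

```python
import urllib.request

NGROK_SKIP_HEADER = "ngrok-skip-browser-warning"


def tunnel_request(url: str) -> urllib.request.Request:
    """Build a GET request carrying the header described in the note above."""
    return urllib.request.Request(url, headers={NGROK_SKIP_HEADER: "69420"})
```

Pass the result to urllib.request.urlopen with your own Forwarding URL to send it; without the header, the free tier may return the interstitial HTML page instead of your backend's response.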
- Start your server.js and your Ngrok tunnel on your laptop.
- Open your Vercel deployment URL on your smartphone (e.g., https://kafka-delivery-tracking.vercel.app/mobile-tracker.html).
- Enter your assigned Rider ID and press Start Tracking Me.
- Open the map dashboard (index.html on Vercel) on your laptop or another device, and watch your real-world movements stream directly into your local Kafka broker!