This project builds an end-to-end logistics network intelligence framework for analyzing shipment movement, identifying bottlenecks, predicting ETA delays, and optimizing transportation mode decisions between Full Truck Load (FTL) and Carting operations.
The pipeline combines:
- Exploratory Data Analysis (EDA)
- Graph-based logistics network construction
- Bottleneck and corridor analysis
- ETA prediction using ML models
- Graph embeddings using Node2Vec
The system transforms raw shipment movement data into operational intelligence for network strategy teams.
View Report Here: https://drive.google.com/file/d/1V_9cDIRs3gj0Edc0-NrF5iAJBZSEyPs2/view?usp=sharing
Modern logistics networks face:
- Hub congestion
- Unpredictable delivery delays
- Corridor inefficiencies
- Poor transport mode allocation
- Revenue leakage due to SLA failures
The objective of this project is to:
- Identify high-friction hubs and corridors
- Predict shipment ETA accurately
- Quantify operational bottlenecks
- Recommend FTL vs Carting decisions
- Estimate business and revenue impact
├── notebooks/
│   ├── 01_eda.ipynb
│   ├── 02_graph_construction.ipynb
│   ├── 03_bottleneck_analysis.ipynb
│   ├── 04_eta_prediction.ipynb
│   └── 05_graph_embeddings.ipynb
├── data/
└── report/

Notebook: 01_eda.ipynb
Performed:
- Null value analysis
- Delay ratio calculation
- Segment-wise delay analysis
- Time distribution analysis
- Shipment duration trends
Key engineered features:
- delay_ratio = actual_time / osrm_time
- segment_delay_ratio
- Hub-level delay aggregation
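The engineered features above can be sketched in pandas as follows. This is a minimal illustration on toy data, not the notebook's actual code: only `actual_time` and `osrm_time` come from the formula above, while `source_center`, `destination_center`, `segment_actual_time`, `segment_osrm_time`, and the definition of `segment_delay_ratio` as their quotient are assumptions.

```python
import pandas as pd

# Toy shipment rows. Column names other than actual_time and osrm_time are assumed.
df = pd.DataFrame({
    "source_center": ["A", "A", "B", "B"],
    "destination_center": ["B", "B", "C", "C"],
    "actual_time": [120.0, 150.0, 90.0, 60.0],
    "osrm_time": [100.0, 100.0, 60.0, 60.0],
    "segment_actual_time": [30.0, 45.0, 20.0, 15.0],
    "segment_osrm_time": [20.0, 30.0, 20.0, 15.0],
})

# Drop rows with non-positive OSRM estimates so the ratios stay finite.
df = df[(df["osrm_time"] > 0) & (df["segment_osrm_time"] > 0)].copy()

# Trip-level ratio from the formula above; the segment-level analogue is an assumption.
df["delay_ratio"] = df["actual_time"] / df["osrm_time"]
df["segment_delay_ratio"] = df["segment_actual_time"] / df["segment_osrm_time"]

# Hub-level aggregation: median delay ratio and shipment count per source hub.
hub_delay = (
    df.groupby("source_center")["delay_ratio"]
      .agg(median_delay_ratio="median", shipments="count")
      .reset_index()
)
print(hub_delay)
```

A ratio above 1 means the shipment took longer than the OSRM estimate. The median is used here because it is less sensitive to a few extreme trips than the mean.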
Insights:
- Significant variance exists between predicted OSRM travel time and actual shipment movement.
- Certain hubs consistently contribute to cascading delays.
- Long-haul corridors show nonlinear delay amplification.
Notebook: 02_graph_construction.ipynb
The logistics network is modeled as a directed weighted graph.
Nodes:
- Logistics hubs
- Source centers
- Destination centers
Edges:
- Shipment movement corridors
Edge weights:
- Median delay ratio
- Corridor traffic frequency
Graph metrics:
- Degree centrality
- Betweenness centrality
- Connectivity analysis
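A directed weighted graph of this kind can be built with NetworkX roughly as shown below. This is a sketch on toy data under stated assumptions: the column names (`source_center`, `destination_center`, `delay_ratio`) and the choice to use the median delay ratio as the path cost for betweenness are illustrative, not confirmed details of the notebook.

```python
import networkx as nx
import pandas as pd

# Toy trip records (assumed column names).
trips = pd.DataFrame({
    "source_center": ["A", "A", "B", "C", "B"],
    "destination_center": ["B", "B", "C", "D", "D"],
    "delay_ratio": [1.2, 1.6, 1.1, 2.0, 1.3],
})

# One row per corridor: median delay ratio and traffic frequency.
corridors = (
    trips.groupby(["source_center", "destination_center"])["delay_ratio"]
         .agg(median_delay_ratio="median", traffic="count")
         .reset_index()
)

# Directed graph: hubs are nodes, corridors are edges carrying both attributes.
G = nx.from_pandas_edgelist(
    corridors,
    source="source_center",
    target="destination_center",
    edge_attr=["median_delay_ratio", "traffic"],
    create_using=nx.DiGraph,
)

degree = nx.degree_centrality(G)
# The weight is interpreted as a distance, so low-delay paths count as "shorter".
betweenness = nx.betweenness_centrality(G, weight="median_delay_ratio")
weakly_connected = nx.is_weakly_connected(G)

print(sorted(betweenness.items(), key=lambda kv: kv[1], reverse=True))
```

In this toy network hub B ranks highest on betweenness because every route out of A passes through it.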
Why graph modeling? Traditional tabular analytics fail to capture network dependencies. Graphs enable:
- Bottleneck identification
- Route criticality estimation
- Corridor dependency analysis
- Network flow optimization
Notebook: 03_bottleneck_analysis.ipynb
This stage identifies:
- High-delay corridors
- Congested hubs
- Critical transit dependencies
Method:
- Aggregate median delay ratio per corridor
- Rank corridors by delay intensity
- Analyze centrality metrics
- Identify hubs with high transit dependency
Insights:
- Hubs with high betweenness centrality are operationally critical.
- Small disruptions at these hubs create cascading SLA failures.
- Certain corridors exhibit structurally high delay ratios and require intervention.
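Corridor ranking by delay intensity might look like the sketch below. The corridor table is toy data, and both `MIN_TRAFFIC` and `TOP_SHARE` are hypothetical thresholds introduced only for illustration; the notebook may rank and filter differently.

```python
import pandas as pd

# Toy corridor statistics (one row per corridor).
corridors = pd.DataFrame({
    "corridor": ["A->B", "B->C", "C->D", "B->D", "D->E"],
    "median_delay_ratio": [1.4, 1.1, 2.0, 1.3, 3.0],
    "traffic": [120, 80, 60, 45, 2],
})

MIN_TRAFFIC = 10   # hypothetical: ignore corridors with too few trips to trust
TOP_SHARE = 0.25   # hypothetical: flag roughly the worst quarter as bottlenecks

# Filter out sparse corridors, then rank by delay intensity.
eligible = corridors[corridors["traffic"] >= MIN_TRAFFIC]
ranked = (
    eligible.sort_values("median_delay_ratio", ascending=False)
            .reset_index(drop=True)
)

# Flag corridors at or above the chosen quantile of the median delay ratio.
cutoff = ranked["median_delay_ratio"].quantile(1 - TOP_SHARE)
bottlenecks = ranked[ranked["median_delay_ratio"] >= cutoff]

print(bottlenecks)
```

The traffic filter keeps a corridor with only a couple of trips (here `D->E`) from dominating the ranking on the strength of a noisy median. Hubs can be ranked the same way by sorting on betweenness centrality.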
Notebook: 04_eta_prediction.ipynb
Machine learning models are used to predict shipment ETA.
Input features:
- Source center
- Destination center
- OSRM time
- Segment travel features
- Centrality features
- Corridor statistics
Candidate models:
- LightGBM
- CatBoost
- XGBoost
- Ensemble learning
Objective: minimize ETA prediction error and improve shipment planning reliability.
Business value:
- Better customer SLA adherence
- Dynamic dispatch optimization
- Reduced operational uncertainty
Notebook: 05_graph_embeddings.ipynb
Node2Vec is used to learn dense vector embeddings for hubs.
Graph embeddings capture:
- Structural similarity
- Traffic behavior similarity
- Transit role similarity
Applications:
- Better ML feature representation
- Improved ETA prediction
- Corridor similarity clustering
- Hidden bottleneck discovery
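Node2Vec learns embeddings in two steps: it samples biased second-order random walks over the graph, then trains a skip-gram model on those walks. The sketch below covers only the walk-sampling step in plain Python, under stated assumptions: the toy `hubs` adjacency map, the helper name `node2vec_walk`, and the simplified neighbor check (an edge from the previous hub to the candidate) are illustrative, and the notebook most likely relies on a library implementation instead.

```python
import random

def node2vec_walk(graph, start, length, p=1.0, q=1.0, rng=None):
    """Sample one biased walk over a graph given as {node: {neighbor: weight}}."""
    if rng is None:
        rng = random.Random()
    walk = [start]
    while len(walk) < length:
        current = walk[-1]
        neighbors = list(graph.get(current, {}))
        if not neighbors:
            break  # dead end: the hub has no outgoing corridor
        if len(walk) == 1:
            # First step: plain weighted choice, no previous node yet.
            weights = [graph[current][n] for n in neighbors]
        else:
            previous = walk[-2]
            weights = []
            for n in neighbors:
                if n == previous:
                    bias = 1.0 / p   # return to the previous hub
                elif n in graph.get(previous, {}):
                    bias = 1.0       # stay close to the previous hub
                else:
                    bias = 1.0 / q   # move further away
                weights.append(graph[current][n] * bias)
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk

# Toy hub graph; edge weights stand in for corridor traffic frequency.
hubs = {
    "A": {"B": 2.0},
    "B": {"A": 1.0, "C": 1.0, "D": 1.0},
    "C": {"D": 1.0},
    "D": {},
}

rng = random.Random(42)
walks = [
    node2vec_walk(hubs, node, length=5, p=1.0, q=0.5, rng=rng)
    for node in hubs
    for _ in range(10)
]
print(walks[:3])
```

A small `q` pushes walks outward and tends to capture transit-role similarity, while a small `p` keeps walks local. The resulting walks would then be passed as "sentences" to a skip-gram model to obtain one vector per hub.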
The FTL vs Carting analysis determines the optimal transportation mode for each shipment based on:
- Shipment volume
- Corridor stability
- Delay risk
- Cost efficiency
- Delivery urgency
Full Truck Load (FTL): a dedicated vehicle is assigned to a shipment.
Advantages:
- Faster transit
- Lower handling risk
- Better SLA consistency
- Reduced touchpoints
Disadvantages:
- Higher cost for low utilization
- Inefficient for fragmented loads
Carting: a shared shipment consolidation model.
Advantages:
- Lower cost
- Better utilization
- Efficient for low-volume shipments
Disadvantages:
- Higher transit uncertainty
- Multiple handling points
- Greater delay probability
| Parameter | FTL Preferred | Carting Preferred |
|---|---|---|
| Shipment Volume | High | Low |
| Corridor Delay Variance | Low | Moderate |
| SLA Criticality | High | Medium |
| Transit Frequency | Stable | Variable |
| Shipment Value | High | Low |
| Cost Sensitivity | Medium | High |
| Delivery Urgency | High | Low |
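One way to turn the table into a rule is a simple vote across the seven parameters, as in the hypothetical sketch below. The `Shipment` fields, the equal weighting, and the majority threshold of 4 are all assumptions made for illustration; a production rule would more likely weight the parameters and calibrate the threshold on historical outcomes.

```python
from dataclasses import dataclass

@dataclass
class Shipment:
    # Each flag mirrors one row of the decision table (hypothetical encoding).
    high_volume: bool
    low_delay_variance: bool
    sla_critical: bool
    stable_frequency: bool
    high_value: bool
    cost_sensitive: bool
    urgent: bool

def recommend_mode(s: Shipment, threshold: int = 4) -> str:
    """Return 'FTL' when at least `threshold` of the seven rows favor FTL."""
    ftl_votes = sum([
        s.high_volume,
        s.low_delay_variance,
        s.sla_critical,
        s.stable_frequency,
        s.high_value,
        not s.cost_sensitive,  # high cost sensitivity favors Carting
        s.urgent,
    ])
    return "FTL" if ftl_votes >= threshold else "Carting"

enterprise_load = Shipment(True, True, True, True, True, False, True)
rural_parcel = Shipment(False, False, False, False, False, True, False)

print(recommend_mode(enterprise_load))
print(recommend_mode(rural_parcel))
```

Equal votes keep the rule transparent, which makes it easy to audit against the table before any weights are tuned.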
FTL is recommended for:
- High-frequency corridors
- Enterprise clients
- Time-sensitive shipments
- Stable long-haul routes
Expected Benefits:
- 15–25% lower delay probability
- Better ETA reliability
- Reduced re-routing risk
Carting is recommended for:
- Low shipment density corridors
- Non-critical deliveries
- Rural or fragmented networks
Expected Benefits:
- Better fleet utilization
- Lower operating cost
- Higher network flexibility
Characteristics of high-delay corridors:
- High delay ratio
- Large variance
- Multi-hop dependency
Recommended Action:
- Shift high-priority loads to FTL
- Add transit buffers
- Introduce dynamic routing
Characteristics of stable corridors:
- Predictable transit time
- Lower congestion
- Balanced throughput
Recommended Action:
- Continue carting optimization
- Improve consolidation efficiency
Operational improvements can generate revenue impact through:
- Reduced SLA penalties
- Improved customer retention
- Higher shipment throughput
- Lower idle fleet cost
- Better route planning
Estimated Impact Areas:
- 8–12% reduction in delay-related penalties
- 10–18% operational efficiency gain
- Improved fleet utilization
- Better customer satisfaction metrics
Future enhancements:
- Real-time traffic integration
- Dynamic route optimization
- Reinforcement learning for dispatch
- Live congestion prediction
- Multi-objective network optimization
| Category | Tools |
|---|---|
| Language | Python |
| Data Analysis | Pandas, NumPy |
| Visualization | Matplotlib, Seaborn |
| Graph Analytics | NetworkX |
| ML Models | LightGBM, CatBoost, XGBoost |
| Graph Embeddings | Node2Vec |
| Notebook Environment | Jupyter |
This project demonstrates how graph analytics and machine learning can transform logistics operations into a data-driven intelligent network.
The combined use of:
- Graph centrality analysis
- Corridor bottleneck detection
- ETA prediction
- Node embeddings
- Strategic FTL vs Carting optimization
creates a scalable framework for logistics network intelligence and operational decision-making.