Devendra2610/Multiple-Linear-Regression

🏎️ Formula 1 Lap Time Prediction

Predicting Race Outcomes with Simple & Multiple Linear Regression

Open In Colab Python scikit-learn License: MIT


📌 Project Overview

In Formula 1, every tenth of a second matters. Race engineers use predictive models to decide when to pit, which tyre compound to deploy, and how fast a car will degrade over a stint.

This project applies Simple and Multiple Linear Regression to real F1 telemetry data (101,371 laps across the 2022–2025 seasons) to predict lap times — a real problem in modern motorsport analytics.


📊 Dataset

| Property | Value |
| --- | --- |
| Name | F1 Strategy Dataset v4 |
| Rows | 101,371 individual laps |
| Features | 16 columns |
| Seasons | 2022, 2023, 2024, 2025 |
| Target | LapTime (s), continuous numerical |

Key Features Used:

  • TyreLife — laps completed on the current set of tyres
  • LapNumber — current lap in the race
  • RaceProgress — fraction of race completed (0–1)
  • Position — current race position
  • Compound — tyre type (SOFT / MEDIUM / HARD)

🧠 Models Built

Simple Linear Regression

  • Feature: TyreLife (single predictor)
  • Equation: LapTime = m × TyreLife + b
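To make the equation concrete, here is a minimal sketch of the closed-form least-squares fit for a single predictor. The stint values are made up for illustration and are not taken from the dataset, and `fit_simple_lr` is a hypothetical helper name; the notebook itself presumably relies on scikit-learn for this step.

```python
def fit_simple_lr(x, y):
    """Return (m, b) minimising squared error for y ≈ m * x + b."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Slope = covariance(x, y) / variance(x); intercept follows from the means.
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    m = sxy / sxx
    b = y_mean - m * x_mean
    return m, b


# Made-up stint (assumption): 90 s base lap time, +0.05 s per lap of tyre age.
tyre_life = [1, 2, 3, 4, 5, 6, 7, 8]
lap_time = [90.0 + 0.05 * t for t in tyre_life]

m, b = fit_simple_lr(tyre_life, lap_time)
print(f"LapTime = {m:.3f} x TyreLife + {b:.3f}")
```

Here `m` is the estimated degradation in seconds per lap of tyre age and `b` is the predicted lap time on brand-new tyres.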

Multiple Linear Regression

  • Features: TyreLife, LapNumber, RaceProgress, Position, Compound_Speed
  • Equation: LapTime = m₁×TyreLife + m₂×LapNumber + m₃×RaceProgress + m₄×Position + m₅×Compound_Speed + b
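The multi-feature equation could be fitted along these lines with scikit-learn. This is a sketch on synthetic, noise-free data: the coefficient values, the random feature ranges, and the `COMPOUND_SPEED` mapping are all assumptions for illustration, since the README does not state how `Compound_Speed` is derived from `Compound`.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical ordinal encoding; the notebook's real Compound_Speed may differ.
COMPOUND_SPEED = {"SOFT": 3, "MEDIUM": 2, "HARD": 1}

rng = np.random.default_rng(0)
n = 200

# Synthetic features drawn independently (an assumption, not real telemetry).
tyre_life = rng.integers(1, 40, size=n)
lap_number = rng.integers(1, 70, size=n)
race_progress = rng.uniform(0.0, 1.0, size=n)
position = rng.integers(1, 21, size=n)
compound = rng.choice(list(COMPOUND_SPEED), size=n)
compound_speed = np.array([COMPOUND_SPEED[c] for c in compound])

# Column order matches the equation: m1..m5.
X = np.column_stack(
    [tyre_life, lap_number, race_progress, position, compound_speed]
)

# Made-up "true" coefficients used only to generate the target.
true_coefs = np.array([0.05, 0.01, -1.5, 0.02, -0.4])
true_intercept = 92.0
y = X @ true_coefs + true_intercept

model = LinearRegression().fit(X, y)
print("coefficients:", model.coef_)
print("intercept:", model.intercept_)
```

On real laps, `LapNumber` and `RaceProgress` are likely to be strongly correlated within a race, so individual coefficients may be less stable than in this synthetic setup even when overall predictions are fine.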

📈 Results

| Metric | Simple LR | Multiple LR | Improvement |
| --- | --- | --- | --- |
| MAE | — s | — s | ↓ ~X% |
| RMSE | — s | — s | ↓ ~X% |
| R² Score | — | — | ↑ +X |

Run the notebook to see your actual metrics filled in!
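For reference, the three metrics in the table can be computed as sketched below. The lap times are made-up values chosen so the arithmetic is easy to check by hand; the function names are illustrative, and scikit-learn provides equivalents in `sklearn.metrics`.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error, in the same units as the target (seconds)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error; penalises large misses more than MAE."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def r2_score(y_true, y_pred):
    """Share of target variance explained by the predictions."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def pct_improvement(simple_error, multi_error):
    """Percentage reduction in error from the simple to the multiple model."""
    return (simple_error - multi_error) / simple_error * 100.0


# Made-up lap times in seconds (assumption, not results from the notebook).
y_true = [90.0, 91.0, 92.0, 93.0]
y_pred = [90.5, 90.5, 92.5, 92.5]

print(mae(y_true, y_pred), rmse(y_true, y_pred), r2_score(y_true, y_pred))
```

MAE and RMSE are in seconds, so they can be read directly as "typical miss per lap", while R² is unitless.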


🗂️ Project Structure

f1-lap-time-regression/
│
├── F1_LapTime_Regression.ipynb   ← Main notebook (run this!)
├── f1_strategy_dataset_v4.csv    ← Dataset
├── README.md                     ← You are here
└── plots/                        ← Exported visualizations
    ├── plot1_lapdist.png
    ├── plot2_heatmap.png
    ├── plot3_degradation.png
    ├── plot4_raceprogress.png
    ├── plot5_simple_lr.png
    ├── plot6_multi_lr.png
    └── plot7_comparison.png

🚀 Quick Start

Option 1 — Google Colab (Recommended)

  1. Click the Open in Colab badge above
  2. Upload f1_strategy_dataset_v4.csv when prompted
  3. Run all cells (Runtime → Run all)

Option 2 — Local Setup

# Clone the repo
git clone https://github.com/YOUR_USERNAME/f1-lap-time-regression.git
cd f1-lap-time-regression

# Install dependencies
pip install pandas numpy matplotlib seaborn plotly scikit-learn

# Launch Jupyter
jupyter notebook F1_LapTime_Regression.ipynb

📦 Dependencies

pandas >= 1.5
numpy >= 1.23
matplotlib >= 3.6
seaborn >= 0.12
plotly >= 5.11
scikit-learn >= 1.2

🔍 Key Insights

  • Tyre degradation is real and measurable: lap times rise roughly linearly with tyre age, most strongly on SOFT compounds
  • Using several features reduces prediction error compared with a single-variable model
  • Data leakage pitfall: LapTime_Delta and Cumulative_Degradation were excluded because they are derived directly from the target variable
  • Safety car laps and weather transitions are the largest residual outliers — future models should flag these
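One way the residual outliers mentioned above might be flagged is with a robust threshold on the residuals. This is a sketch, not the notebook's method: the `flag_outliers` helper, the cut-off `k = 5`, and the residual values are all assumptions for illustration.

```python
from statistics import median


def flag_outliers(residuals, k=5.0):
    """Flag residuals further than k median-absolute-deviations from the median."""
    med = median(residuals)
    mad = median(abs(r - med) for r in residuals)
    if mad == 0:
        # No spread to measure against, so nothing is flagged.
        return [False] * len(residuals)
    return [abs(r - med) > k * mad for r in residuals]


# Made-up residuals in seconds (actual minus predicted); the 12 s miss
# stands in for something like a safety car lap.
residuals = [0.1, -0.2, 0.15, 12.0, -0.1]
flags = flag_outliers(residuals)
print(flags)
```

The median-based spread is used here because a plain standard deviation would itself be inflated by the very laps being flagged.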

🔮 Future Improvements

  • Add TrackTemp_C — grip varies massively with track temperature
  • Add FuelLoad_kg — fuel burn improves lap time by ~0.03s/lap
  • Add IsSafetyCar flag — removes major outliers
  • Try Polynomial Regression: tyre degradation is unlikely to be strictly linear over a long stint
  • Try Random Forest / XGBoost — handles interaction effects
  • Integrate FastF1 API for real-time live race predictions
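As a rough illustration of the polynomial regression idea, the sketch below compares a straight-line fit with a degree-2 fit on synthetic data. The quadratic degradation curve is an assumption made up for the example, not a finding from the dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stint (assumption): degradation accelerates as tyres age.
tyre_life = np.arange(1, 31).reshape(-1, 1)
age = tyre_life.ravel()
lap_time = 90.0 + 0.02 * age + 0.003 * age**2

# Straight-line baseline versus a degree-2 polynomial on the same feature.
linear = LinearRegression().fit(tyre_life, lap_time)
quadratic = make_pipeline(
    PolynomialFeatures(degree=2, include_bias=False),
    LinearRegression(),
).fit(tyre_life, lap_time)

r2_linear = linear.score(tyre_life, lap_time)
r2_quadratic = quadratic.score(tyre_life, lap_time)
pred_40 = quadratic.predict(np.array([[40]]))[0]

print(f"R² linear: {r2_linear:.4f}, R² quadratic: {r2_quadratic:.4f}")
```

The polynomial model is still linear regression underneath; only the feature set grows (`TyreLife` and `TyreLife²`), so the rest of the workflow can stay the same.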

👤 Author

Devendra Agrawal
LinkedIn | GitHub | Kaggle


📄 License

This project is licensed under the MIT License.
Dataset used for educational purposes only.


Built as part of a Machine Learning coursework assignment.
F1 data analysis | Python | scikit-learn | 2025

About

Multiple Linear Regression (MLR) is a technique in Data Analytics that models the relationship between one dependent variable and multiple independent variables. It estimates how each feature influences the outcome, enabling prediction, trend analysis, and data-driven decision making.
