F1 Telemetry Driver Performance Analysis

A machine learning pipeline that models F1 qualifying lap time deltas between driver pairs using telemetry data. The project uses telemetry-derived features from the same qualifying session to estimate lap time differences and identify which driver was faster.

Overview

Given two drivers from the same qualifying session, the model answers: based on how each driver drove their fastest lap, who was faster and by how much?

The pipeline:

Ingests raw F1 telemetry via the FastF1 API
Cleans and aligns multi-driver telemetry to a common distance grid
Engineers speed, throttle, and braking features for each lap
Builds a labeled dataset of pairwise driver comparisons across multiple race weekends
Trains a Random Forest regression model to estimate lap time delta between driver pairs
Evaluates performance using grouped train/test splits by race weekend

Overview of the Approach

For each qualifying session, the pipeline finds drivers with valid clean laps. For every driver pairing, it:

extracts the fastest clean lap for each driver
retrieves and cleans the telemetry data
aligns both drivers' telemetry to a shared distance grid
computes per-lap features and pairwise feature differences
records the actual lap time delta as the target

This project is best interpreted as a same-session telemetry analysis and performance modeling tool, rather than a forward-looking predictor across different sessions.

Project Structure

f1/
├── src/
│   ├── data_loader.py       # FastF1 lap retrieval and telemetry extraction
│   ├── preprocess.py        # Telemetry cleaning and grid alignment
│   ├── features.py          # Per-lap and pairwise feature engineering
│   ├── pipeline.py          # Builds one comparison row from a qualifying session
│   ├── visualize.py         # Speed, throttle, brake, and delta plots
│   ├── baseline_model.py    # Optional baseline utilities
│   └── checking_data.py     # Model training and evaluation
├── outputs/
│   ├── tables/              # Generated dataset CSVs
│   └── figures/             # Generated plots
├── run_compare.py           # Main script to build the dataset
├── requirements.txt
└── README.md

Features

All pairwise features are computed as Driver A minus Driver B differences.

Feature	Description
`mean_speed_diff`	Difference in average speed over the lap
`top_speed_diff`	Difference in maximum speed
`min_speed_diff`	Difference in minimum speed
`std_speed_diff`	Difference in speed variability
`full_throttle_diff`	Difference in percentage of lap at full throttle (>95%)
`pct_brake_diff`	Difference in percentage of lap spent braking
`avg_throttle_diff`	Difference in average throttle application
`brake_event_diff`	Difference in number of distinct braking zones

Target

The regression target is the lap time delta between the two drivers:

target_delta_s = lap_time_A_s - lap_time_B_s

Negative values mean Driver A was faster
Positive values mean Driver B was faster

The model also supports a derived winner prediction task by checking the sign of the predicted delta.

Model and Evaluation

The current model is a RandomForestRegressor trained on telemetry-based pairwise features.

To avoid leaking information between rows from the same race weekend, evaluation uses GroupShuffleSplit, grouping by qualifying session / event. This ensures that entire sessions are held out together during testing.

Reported metrics include:

MAE — mean absolute error in seconds
RMSE — root mean squared error
R² — variance explained
Winner prediction accuracy — percentage of driver pairs where the model correctly predicts who was faster

Example grouped split output from the current version:

MAE: 0.2228 s
RMSE: 0.3222 s
R²: 0.7229
Winner prediction accuracy: 98.95%

Setup

Requirements

Python 3.9+
See requirements.txt for the full dependency list

Installation

git clone https://github.com/Esun4/f1.git
cd f1
python -m venv .venv
source .venv/bin/activate      # Windows: .venv\Scripts\activate
pip install -r requirements.txt

Cache Setup

FastF1 caches session data locally to avoid repeated API calls. Create a cache directory and update the path in run_compare.py:

fastf1.Cache.enable_cache("path/to/your/cache")

Usage

Build the Dataset

python run_compare.py

This loops through the specified qualifying rounds, generates all valid driver pairings, and writes the comparison dataset to outputs/tables/dataset.csv.

Train and Evaluate the Model

python src/checking_data.py

This prints grouped train/test split information, regression metrics, and winner prediction accuracy on the held-out test sessions.

Current Results

On held-out qualifying sessions, the current version produces strong same-session performance:

MAE: 0.2228 s
RMSE: 0.3222 s
R²: 0.7229
Winner prediction accuracy: 98.95%

These results show that telemetry-derived lap features capture a large amount of the variation in within-session lap time differences.

Limitations

This project currently uses telemetry features and lap-time targets derived from the same qualifying session. That makes it better suited for same-session performance modeling and telemetry analysis than for true future-session forecasting.

A future extension would be to restructure the pipeline so that earlier-session telemetry is used to predict later-session performance.

Dependencies

Key packages:

fastf1 — F1 session and telemetry data
pandas, numpy — data manipulation and numerical operations
scikit-learn — model training and evaluation
matplotlib — telemetry and delta visualizations

Name		Name	Last commit message	Last commit date
Latest commit History 29 Commits
config		config
src		src
.gitignore		.gitignore
README.md		README.md
requirements.txt		requirements.txt
results.txt		results.txt
run_compare.py		run_compare.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

F1 Telemetry Driver Performance Analysis

Overview

Overview of the Approach

Project Structure

Features

Target

Model and Evaluation

Setup

Requirements

Installation

Cache Setup

Usage

Build the Dataset

Train and Evaluate the Model

Current Results

Limitations

Dependencies

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

F1 Telemetry Driver Performance Analysis

Overview

Overview of the Approach

Project Structure

Features

Target

Model and Evaluation

Setup

Requirements

Installation

Cache Setup

Usage

Build the Dataset

Train and Evaluate the Model

Current Results

Limitations

Dependencies

License

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages