DuckLake Tutorial

A hands-on tutorial for building a local lakehouse with DuckDB, DuckLake, and SQLMesh.

Why This Tutorial?

DuckLake brings lakehouse capabilities (ACID transactions, time travel, Parquet storage) to DuckDB. Combined with SQLMesh for data transformations, you get a lightweight but production-ready data stack that runs entirely on your laptop.

This repo is an unofficial companion to the Tobiko blog post, packaged as a Jupyter notebook for ease of exploration. The blog post covers:

What lakehouses are and why they matter
How DuckLake compares to other open table formats
The layered data architecture (raw → staging → marts)

New to these concepts? Read the blog post first, then come back here to build it yourself.

Prerequisites

Python 3.13+
uv package manager

Setup

# Install dependencies
uv sync

# Launch JupyterLab
uv run jupyter lab

Usage

Open ducklake_tutorial.ipynb
Run all cells

The notebook will:

Download NYC Taxi trip data (~50K rows sampled)
Initialize a DuckLake lakehouse
Run SQLMesh transformations (staging → dims → facts)
Query the transformed data
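
The "Initialize a DuckLake lakehouse" step can be sketched roughly as follows. This is a minimal illustration rather than the notebook's actual code: the function names, the `lake` alias, and the paths `data/metadata.ducklake` and `data/lake/` are all hypothetical. The SQL follows the `ATTACH 'ducklake:...'` form described in the DuckLake documentation.

```python
def ducklake_init_statements(
    metadata_path: str, data_path: str, alias: str = "lake"
) -> list[str]:
    """Build the SQL that attaches a DuckLake catalog inside DuckDB.

    All argument values are illustrative; the notebook may use other paths.
    """
    return [
        "INSTALL ducklake",  # fetch the extension (needs network the first time)
        "LOAD ducklake",     # load it into the current session
        # The metadata file tracks snapshots; DATA_PATH is where Parquet files go.
        f"ATTACH 'ducklake:{metadata_path}' AS {alias} (DATA_PATH '{data_path}')",
        f"USE {alias}",      # make the lakehouse the default catalog
    ]


def init_ducklake(con, metadata_path: str, data_path: str, alias: str = "lake") -> None:
    """Run the statements on any object exposing execute(sql),
    such as a connection returned by duckdb.connect()."""
    for statement in ducklake_init_statements(metadata_path, data_path, alias):
        con.execute(statement)
```

To try it against a real database, pass in a connection from `duckdb.connect()`, for example `init_ducklake(duckdb.connect(), "data/metadata.ducklake", "data/lake/")`. Keeping the SQL in a separate function makes it easy to inspect the statements before running them.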

Project Structure

├── ducklake_tutorial.ipynb  # Main tutorial
├── src/ducklake/            # Helper utilities
├── sqlmesh/                 # SQLMesh config & models
├── data/                    # Generated data (gitignored)
└── pyproject.toml           # Dependencies

Data

The tutorial uses the NYC TLC Trip Record Data (Yellow Taxi, January 2024).
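
As a rough sketch of how the source file might be located and sampled, the helpers below build a download URL and a DuckDB sampling query. The base URL is an assumption based on the pattern the TLC site has used for its Parquet files, and the helper names are hypothetical; the notebook may fetch and sample the data differently.

```python
# Assumed host for TLC Parquet files; verify against the TLC Trip Record Data page.
TLC_BASE_URL = "https://d37ci6vzurychx.cloudfront.net/trip-data"


def yellow_taxi_url(year: int, month: int) -> str:
    """Return the assumed URL of one month of Yellow Taxi trip records."""
    if not 1 <= month <= 12:
        raise ValueError(f"month must be between 1 and 12, got {month}")
    return f"{TLC_BASE_URL}/yellow_tripdata_{year:04d}-{month:02d}.parquet"


def sample_query(url: str, rows: int = 50_000) -> str:
    """Return a DuckDB query that reads the Parquet file and keeps a random sample."""
    return f"SELECT * FROM read_parquet('{url}') USING SAMPLE {rows} ROWS"


if __name__ == "__main__":
    url = yellow_taxi_url(2024, 1)
    print(url)
    print(sample_query(url))
```

The query string can be passed to a DuckDB connection's `execute` method; reading directly over HTTPS relies on DuckDB's `httpfs` extension being available.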
