Learning CUDA properly: from fundamentals to real-world GPU systems.
This repo documents my journey of learning CUDA from scratch.
Not prompt-generated. Not copy-paste.
Just real understanding, built step by step.
There is too much low-effort, AI-generated content around CUDA.
Most resources are either too academic or too shallow.
This repo is different.
It focuses on:
- understanding how GPUs actually work
- building intuition before writing code
- connecting CUDA to real-world systems (Kubernetes, AI workloads, etc.)
The project is split into two main parts.

Part 1: Fundamentals. This is where everything starts.
It covers the fundamentals:
- GPU architecture
- memory model
- compute capability
- performance fundamentals
- hardware evolution (up to 2026 architectures)
Each section is structured step by step.
Each folder contains:
- notes (clear explanations)
- visual summaries
- a structured learning progression
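As a small taste of the fundamentals above, here is a minimal sketch that queries each GPU's compute capability through the CUDA runtime API. It assumes the CUDA toolkit is installed (compile with `nvcc`) and a CUDA-capable device is present:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int deviceCount = 0;
    cudaError_t err = cudaGetDeviceCount(&deviceCount);
    if (err != cudaSuccess || deviceCount == 0) {
        std::printf("No CUDA device found: %s\n", cudaGetErrorString(err));
        return 1;
    }
    for (int i = 0; i < deviceCount; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        // Compute capability (major.minor) determines which hardware
        // features and instruction sets the device supports,
        // e.g. 8.0 for A100 (Ampere), 9.0 for H100 (Hopper).
        std::printf("Device %d: %s, compute capability %d.%d\n",
                    i, prop.name, prop.major, prop.minor);
    }
    return 0;
}
```

Compute capability is worth checking early: it decides which CUDA features your code can rely on, independent of the toolkit version you compile with.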
Part 2: Practice. This is where theory turns into practice.
- No noise, no fluff
- No blind copy from docs
- Built the way an engineer builds, not the way a tutorial reads
Everything here is written to answer one question:
“Do I actually understand what’s happening?”
This project is supported by JetBrains. JetBrains provides professional developer tools that I actively use for CUDA development, experimentation, and documentation.
This project is also supported by Manning Publications. They provide high-quality technical books that I use to deepen my understanding of CUDA, GPU systems, and parallel computing.
Special thanks to Manning Publications for providing CUDA for Deep Learning (by Elliot Arledge).

