Working on LLVM for HPC: loop transforms, vectorization passes, and cost model tuning for AArch64 (Neoverse V2 / SVE2).
Learning MLIR to work on ML compiler infrastructure; currently tracing how ML kernels lower through Linalg/Affine to the LLVM backend.
Current focus
- Middle-end: loop vectorizer, SLP vectorizer, TTI cost models
- Target: AArch64 + SVE2 (Neoverse V2)
- Exploring: torch-mlir, Linalg dialect, end-to-end kernel lowering
Background
- Deep learning fundamentals: PyTorch, model internals
- CS grad → compiler engineering via HPC
