class AnayDongre:
    role = "ML Engineer"
    code = ["Python", "Rust", "C++", "CUDA", "TypeScript"]
    focus = ["LLM Inference", "Training Infrastructure", "Open Source"]
    research = ["Speculative Decoding", "KV Cache Optimization", "PEFT"]
    kaggle = "3x Master"
    school = "MSCS @ Cal Poly Pomona (Dec 2026)"

    def current_work(self):
        return [
            "Contributing to sglang, nano-vllm, CocoIndex",
            "Building EigenTune — SVD-based PEFT (pip install eigentune)",
            "Competing in Kaggle NVIDIA Nemotron Reasoning Challenge",
        ]

I contribute code to ML infrastructure projects that other engineers depend on.
CocoIndex — PR #1010 merged. Built SplitBySeparators as a native Rust crate with PyO3 bindings. GIL-free text splitting for ETL hot paths where every millisecond matters.
sglang — Active contributor. LLM serving framework for structured generation and fast inference.
nano-vllm — Contributing to a lightweight LLM inference engine. 12.9k stars.
Hugging Face Hub · PyTorch-Lightning — Bug fixes, patches, examples.
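
To make the idea behind separator-based splitting concrete, here is a minimal plain-Python sketch. It is not the CocoIndex API or the Rust implementation: the function name `split_by_separators_sketch` and its behavior (literal separators, empty pieces dropped) are assumptions chosen for illustration.

```python
import re


def split_by_separators_sketch(text, separators):
    """Split `text` at every occurrence of any literal separator.

    Hypothetical helper for illustration only; the real SplitBySeparators
    lives in Rust and may behave differently.
    """
    if not separators:
        return [text] if text else []
    # Try longer separators first so "\n\n" is preferred over "\n".
    ordered = sorted(separators, key=len, reverse=True)
    pattern = "|".join(re.escape(s) for s in ordered)
    # Drop empty pieces produced by adjacent separators.
    return [piece for piece in re.split(pattern, text) if piece]


chunks = split_by_separators_sketch("intro\n\nbody line 1\nbody line 2", ["\n\n", "\n"])
```

A native implementation can do the same work without holding the Python GIL, which is the point of the Rust crate described above.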
EigenTune — PEFT via SVD
Decomposes weight matrices via SVD. Freezes U/V. Learns 4-bit scalars. 99.5% fewer trainable params, matching full fine-tuning. Ships with CUDA kernels, GGUF/ONNX export, and HuggingFace Trainer integration.
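
A rough way to see why training only singular-value scalars shrinks the trainable set is to count parameters for one weight matrix. The sketch below assumes exactly one trainable scalar per singular value and nothing else trainable. That is an assumption for illustration, not EigenTune's actual accounting, and it does not reproduce the 99.5% figure, which is the project's own whole-model number.

```python
def trainable_param_counts(rows, cols):
    """Return (full, scalars_only) trainable counts for a rows x cols matrix."""
    full = rows * cols
    # W = U diag(s) V^T has min(rows, cols) singular values.
    # Assumption: U and V are frozen and only s is trained.
    scalars_only = min(rows, cols)
    return full, scalars_only


def reduction_percent(rows, cols):
    """Percentage of trainable parameters removed for one matrix."""
    full, scalars_only = trainable_param_counts(rows, cols)
    return 100.0 * (1.0 - scalars_only / full)


print(f"{reduction_percent(1024, 1024):.2f}%")  # prints 99.90%
```

Whole-model reductions depend on which layers are adapted and on any other parameters left trainable, so they will differ from this per-matrix figure.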
Nano LLAMA — from scratch
221M-param decoder-only transformer. RMSNorm, RoPE, SwiGLU, mixed-precision, gradient checkpointing. Trained on a single 4GB GPU. Reproducible scripts. No cloud budget required.
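
Of the components listed above, RMSNorm is the simplest to show in a few lines. This is a dependency-free sketch of the standard formula on a single vector, not code from the Nano LLAMA repository; the function name and the `eps` default are assumptions.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """Scale `x` to unit root-mean-square, then apply a per-element gain."""
    mean_sq = sum(v * v for v in x) / len(x)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [v * scale * w for v, w in zip(x, weight)]


normed = rms_norm([3.0, 4.0], [1.0, 1.0])
```

Unlike LayerNorm, there is no mean subtraction and no bias term, which saves a reduction per call.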
Transactional KV Caching for Speculative Decoding under Paged KV Memory — TechRxiv (IEEE), 2026
Draft tokens inflate KV cache pages and fragment memory at high load. TransKV treats speculative writes as a transaction — buffer drafts, commit only accepted tokens, roll back the rest. Exact output equivalence, better throughput.
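
The buffer, commit, and roll-back flow described above can be sketched in plain Python. This is a toy model under stated assumptions, not the paper's implementation: the class name `PagedKVCacheSketch`, its methods, and the use of Python lists in place of paged GPU memory are all hypothetical.

```python
class PagedKVCacheSketch:
    """Toy KV cache where draft writes stay out of committed pages."""

    def __init__(self, page_size=4):
        self.page_size = page_size
        self.committed = []  # entries visible to later decoding steps
        self.draft = []      # speculative entries, buffered until verified

    def pages_in_use(self):
        # Ceiling division over committed entries only:
        # buffered drafts never claim a page.
        return -(-len(self.committed) // self.page_size)

    def write_draft(self, kv_entries):
        """Buffer speculative entries without touching committed pages."""
        self.draft.extend(kv_entries)

    def commit(self, num_accepted):
        """Keep the first `num_accepted` drafts; return how many were rolled back."""
        if not 0 <= num_accepted <= len(self.draft):
            raise ValueError("num_accepted out of range")
        self.committed.extend(self.draft[:num_accepted])
        rolled_back = len(self.draft) - num_accepted
        self.draft.clear()
        return rolled_back


cache = PagedKVCacheSketch(page_size=4)
cache.write_draft(["kv0", "kv1", "kv2"])
cache.commit(3)                      # prompt tokens: all accepted
cache.write_draft(["d0", "d1", "d2", "d3"])
pages_during_draft = cache.pages_in_use()
rolled_back = cache.commit(2)        # verifier accepted two of four drafts
```

Because rejected drafts never reach committed pages, the committed state matches what non-speculative decoding would have produced for the same accepted tokens.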
Blockchain-Based E-Voting with Proof-of-Work and ML — IET Blockchain, 2023 (peer-reviewed)
AWS ML Specialty · 3x Kaggle Master · Codeforces · Akuna Capital Trading Competition



