  • United States
  • LinkedIn in/anayd

MrAnayDongre/README.md


class AnayDongre:
    role      = "ML Engineer"
    code      = ["Python", "Rust", "C++", "CUDA", "TypeScript"]
    focus     = ["LLM Inference", "Training Infrastructure", "Open Source"]
    research  = ["Speculative Decoding", "KV Cache Optimization", "PEFT"]
    
    kaggle    = "3x Master"
    school    = "MSCS @ Cal Poly Pomona (Dec 2026)"
    
    def current_work(self):
        return [
            "Contributing to sglang, nano-vllm, CocoIndex",
            "Building EigenTune — SVD-based PEFT (pip install eigentune)",
            "Competing in Kaggle NVIDIA Nemotron Reasoning Challenge",
        ]

Merged & Shipped

I contribute code to ML infrastructure projects that other engineers depend on.

CocoIndex: PR #1010 merged. Built SplitBySeparators as a native Rust crate with PyO3 bindings, giving GIL-free text splitting for latency-sensitive ETL hot paths.

sglang: Active contributor. LLM serving framework for structured generation and fast inference.

nano-vllm: Contributing to a lightweight LLM inference engine (12.9k stars).

Hugging Face Hub · PyTorch-Lightning — Bug fixes, patches, examples.
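
To give a feel for what separator-based splitting does, here is a minimal pure-Python sketch. It is not the CocoIndex implementation (that one is a Rust crate exposed through PyO3); the function name `split_by_separators` and the `trim` / `include_empty` options are hypothetical and chosen only for illustration.

```python
import re


def split_by_separators(text, separators, trim=True, include_empty=False):
    """Split `text` on any of the given literal separators (illustrative only)."""
    if not separators:
        return [text]
    # Try longer separators first so "\n\n" wins over "\n" at the same position.
    ordered = sorted(separators, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(s) for s in ordered))
    chunks = pattern.split(text)
    if trim:
        chunks = [c.strip() for c in chunks]
    if not include_empty:
        chunks = [c for c in chunks if c]
    return chunks


# Paragraph-level split on blank lines.
doc = "intro\n\nbody line 1\nbody line 2\n\n\noutro"
print(split_by_separators(doc, ["\n\n"]))
```

A native implementation can do the same work without holding the GIL, which is the point of moving it to Rust for ETL pipelines.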


Things I Built

EigenTune — PEFT via SVD


Decomposes weight matrices via SVD. Freezes U/V. Learns 4-bit scalars. 99.5% fewer trainable params, matching full fine-tuning.

Ships with CUDA kernels, GGUF/ONNX export, and HuggingFace Trainer integration.

pip install eigentune
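
The idea of freezing U/V and learning only scalars can be sketched on a toy 2x2 matrix. This is a rough illustration, not EigenTune's code: it assumes one trainable scale per singular value, skips the 4-bit quantization entirely, and every helper name here (`rotation`, `reconstruct`, `trainable_fraction`) is made up for the example.

```python
import math


def rotation(theta):
    """2x2 orthogonal matrix, standing in for a frozen U or V factor."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def diag(values):
    n = len(values)
    return [[values[i] if i == j else 0.0 for j in range(n)] for i in range(n)]


def reconstruct(U, sigma, scale, V):
    """W = U diag(sigma * scale) V^T. Only `scale` is treated as trainable."""
    scaled = [s * g for s, g in zip(sigma, scale)]
    return matmul(matmul(U, diag(scaled)), transpose(V))


def trainable_fraction(d_out, d_in):
    """Share of parameters trained if only one scalar per singular value is learned."""
    return min(d_out, d_in) / (d_out * d_in)


U, V, sigma = rotation(0.3), rotation(-1.1), [2.0, 0.5]
W_pretrained = reconstruct(U, sigma, [1.0, 1.0], V)  # scale of 1 leaves W unchanged
W_adapted = reconstruct(U, sigma, [1.5, 1.0], V)     # adapts W through 2 scalars
print(trainable_fraction(4096, 4096))                # 1/4096 of a square layer
```

Under that assumption a 4096x4096 layer trains 4,096 values instead of about 16.8M. The overall reduction for a full model depends on which layers are adapted, so this arithmetic is not meant to reproduce the figure quoted above.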

Nano LLAMA — from scratch

221M-param decoder-only transformer. RMSNorm, RoPE, SwiGLU, mixed-precision, gradient checkpointing. Trained on a single 4GB GPU.

Reproducible scripts. No cloud budget required.


Research

Transactional KV Caching for Speculative Decoding under Paged KV Memory · TechRxiv (IEEE), 2026

Draft tokens inflate KV cache pages and fragment memory at high load. TransKV treats speculative writes as a transaction — buffer drafts, commit only accepted tokens, roll back the rest. Exact output equivalence, better throughput.
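
The buffer / commit / roll back flow described above can be sketched as a toy cache. This is only an illustration of the transaction idea, not the TransKV implementation: the class and method names are hypothetical, KV entries are plain Python objects rather than tensors, and paging is reduced to a page count.

```python
class TransactionalKVCache:
    """Toy paged KV cache where draft writes stay in a side buffer until commit."""

    def __init__(self, page_size=4):
        self.page_size = page_size
        self.committed = []  # one (token_id, kv) entry per accepted token
        self.draft = []      # speculative entries, not yet in paged storage

    def write_draft(self, token_id, kv):
        # Speculative writes never touch committed pages.
        self.draft.append((token_id, kv))

    def commit(self, num_accepted):
        # Keep only the accepted prefix of the draft, discard the rest.
        if not 0 <= num_accepted <= len(self.draft):
            raise ValueError("num_accepted must be within the draft length")
        self.committed.extend(self.draft[:num_accepted])
        self.draft = []

    def rollback(self):
        # Drop every speculative entry; committed state is untouched.
        self.draft = []

    def pages_in_use(self):
        # Ceiling division: pages needed for the committed tokens only.
        return -(-len(self.committed) // self.page_size)


cache = TransactionalKVCache(page_size=4)
for tok in (11, 12, 13):            # prompt tokens, all accepted
    cache.write_draft(tok, kv=None)
cache.commit(3)

for tok in (21, 22, 23, 24):        # four draft tokens from the small model
    cache.write_draft(tok, kv=None)
cache.commit(1)                     # verifier accepted only the first one
print(len(cache.committed), cache.pages_in_use())  # 4 tokens on 1 page
```

Writing all four drafts straight into paged storage would have touched a second page for tokens that were then rejected. Here the rejected drafts never reach a page at all.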

Blockchain-Based E-Voting with Proof-of-Work and ML · IET Blockchain, 2023 (peer-reviewed)


Stack


AWS ML Specialty · 3x Kaggle Master · Codeforces · Akuna Capital Trading Competition

Pinned

  1. eigentune (Public)

    Python · 1 star

  2. Machine-Learning-Collection (Public template)

    Repo for Implementing Research Papers & Projects related to Machine Learning

    Python · 13 stars · 4 forks

  3. cocoindex (Public)

    Forked from cocoindex-io/cocoindex

    Data transformation framework for AI. Ultra performant, with incremental processing.

    Python · 1 star

  4. nano-vllm (Public)

    Forked from GeeeekExplorer/nano-vllm

    Nano vLLM

    Python