
Pull requests: ikawrakow/ik_llama.cpp

Pull requests list

Fix Qwen3.6-MoE low MTP acceptance rate
#1815 opened May 17, 2026 by ikawrakow Owner
CUDA: add auto offload threshold for MoE expert ops
#1813 opened May 17, 2026 by joelfarthing Contributor Draft
2 of 4 tasks
Initial refactoring of the spec in server-context
#1808 opened May 15, 2026 by SamuelOliveirads Collaborator
Extend expiring logit bias to other sampling parameters
#1770 opened May 10, 2026 by dungquixote42 Contributor
2 of 4 tasks
Slightly expand the usage of VNNI256
#1764 opened May 9, 2026 by XZiar Contributor
2 of 4 tasks
runtime : add --run-time-repack auto mode for swap-bound MoE safety
#1738 opened May 4, 2026 by AndrewMoryakov Contributor
2 of 4 tasks
Change signature of llama_set_draft_input_hidden_state
#1727 opened May 3, 2026 by ikawrakow Owner
convert_hf_to_gguf: add Qwen3.5 / Qwen3.6 / Qwen3-Next support
#1654 opened Apr 18, 2026 by markaalonzo Contributor Draft
5 of 7 tasks
Alternative graph parallel for MiniMax-M2
#1644 opened Apr 16, 2026 by ikawrakow Owner
Add reuse property to ggml_cgraph
#1617 opened Apr 11, 2026 by ikawrakow Owner
Mamba-2 + Nemotron-H MoE backport (Phase 3.x)
#1593 opened Apr 6, 2026 by AIdevsmartdata
5 tasks
Add GLM 5 MTP
#1513 opened Mar 25, 2026 by SamuelOliveirads Collaborator