Pull requests: ikawrakow/ik_llama.cpp
Pull requests list
CUDA: add auto offload threshold for MoE expert ops
#1813
opened May 17, 2026 by
joelfarthing
Contributor
Draft
2 of 4 tasks
Initial refactoring of the spec in server-context
#1808
opened May 15, 2026 by
SamuelOliveirads
Collaborator
fix(server): reset chat parser on slot reuse to prevent crash (#1763)
#1794
opened May 13, 2026 by
gapeleon
Contributor
Extend expiring logit bias to other sampling parameters
#1770
opened May 10, 2026 by
dungquixote42
Contributor
2 of 4 tasks
Slightly expand the usage of VNNI256
#1764
opened May 9, 2026 by
XZiar
Contributor
2 of 4 tasks
runtime : add --run-time-repack auto mode for swap-bound MoE safety
#1738
opened May 4, 2026 by
AndrewMoryakov
Contributor
2 of 4 tasks
Change signature of llama_set_draft_input_hidden_state
#1727
opened May 3, 2026 by
ikawrakow
Owner
convert_hf_to_gguf: add Qwen3.5 / Qwen3.6 / Qwen3-Next support
#1654
opened Apr 18, 2026 by
markaalonzo
Contributor
Draft
5 of 7 tasks
Mamba-2 + Nemotron-H MoE backport (Phase 3.x)
#1593
opened Apr 6, 2026 by
AIdevsmartdata
5 tasks