ikawrakow / ik_llama.cpp Public

Pull requests: ikawrakow/ik_llama.cpp

14 Open 1,143 Closed

Pull requests list

Fix Qwen3.6-MoE low MTP acceptance rate

#1815 opened May 17, 2026 by ikawrakow Owner

CUDA: add auto offload threshold for MoE expert ops

#1813 opened May 17, 2026 by joelfarthing Contributor • Draft

2 of 4 tasks

Initial refactoring of the spec in server-context

#1808 opened May 15, 2026 by SamuelOliveirads Collaborator

fix(server): reset chat parser on slot reuse to prevent crash (#1763)

#1794 opened May 13, 2026 by gapeleon Contributor

Extend expiring logit bias to other sampling parameters

#1770 opened May 10, 2026 by dungquixote42 Contributor

2 of 4 tasks

Slightly expand the usage of VNNI256

#1764 opened May 9, 2026 by XZiar Contributor

2 of 4 tasks

A GGUF editor, which can be used to duplicate or delete layers in Qwen3.5 / Qwen Coder Next or whatever but may not run in 1 shot.

#1746 opened May 6, 2026 by FNsi • Draft

1 of 3 tasks

runtime : add --run-time-repack auto mode for swap-bound MoE safety

#1738 opened May 4, 2026 by AndrewMoryakov Contributor

2 of 4 tasks

Change signature of llama_set_draft_input_hidden_state

#1727 opened May 3, 2026 by ikawrakow Owner

convert_hf_to_gguf: add Qwen3.5 / Qwen3.6 / Qwen3-Next support

#1654 opened Apr 18, 2026 by markaalonzo Contributor • Draft

5 of 7 tasks

Alternative graph parallel for MiniMax-M2

#1644 opened Apr 16, 2026 by ikawrakow Owner

Add reuse property to ggml_cgraph

#1617 opened Apr 11, 2026 by ikawrakow Owner

Mamba-2 + Nemotron-H MoE backport (Phase 3.x)

#1593 opened Apr 6, 2026 by AIdevsmartdata

5 tasks

Add GLM 5 MTP

#1513 opened Mar 25, 2026 by SamuelOliveirads Collaborator
