Pull requests: ggml-org/llama.cpp
Pull requests list
scripts: ini_to_opencode.py
python
python script changes
script
Script related
#19938
opened Feb 26, 2026 by
am17an
fix dots.ocr: correct RoPE sections and FFN tensor mapping
examples
python
python script changes
#19936
opened Feb 26, 2026 by
anthony-maio
1 of 2 tasks
llama: disable repack by default
devops
improvements to build systems and github actions
#19932
opened Feb 26, 2026 by
am17an
tool parser: add GigaChatV3/3.1 models support in PEG format
testing
Everything test related
#19931
opened Feb 26, 2026 by
Mishusha
metal: add CONV_3D
Apple Metal
https://en.wikipedia.org/wiki/Metal_(API)
ggml
changes relating to the ggml tensor library for machine learning
#19927
opened Feb 26, 2026 by
Ra5hidIslam
server : support multiple model aliases via comma-separated --alias
examples
python
python script changes
server
#19926
opened Feb 26, 2026 by
ServeurpersoCom
ggml : fix AMX and add batched support
ggml
changes relating to the ggml tensor library for machine learning
#19925
opened Feb 26, 2026 by
angt
ggml-zendnn : update code for latest ZenDNN API
documentation
Improvements or additions to documentation
ggml
changes relating to the ggml tensor library for machine learning
#19923
opened Feb 26, 2026 by
z-vishal
llama/ggml: multi-GPU pipeline parallelism (xdev host staging) + faster model loading
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#19922
opened Feb 26, 2026 by
mxxm-t
[SYCL] Replace the magic number 768 by max work group size to support iGPU
ggml
changes relating to the ggml tensor library for machine learning
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
#19920
opened Feb 26, 2026 by
arthw
ggml-cuda: add mem check for fusion
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
vendors: update miniaudio library to 0.11.24
python
python script changes
script
Script related
#19914
opened Feb 26, 2026 by
data-man
test-backend-ops: allow loading tests from JSON and parsing model operators into JSON
examples
testing
Everything test related
#19896
opened Feb 25, 2026 by
0cc4m
Mirroring /v1/responses to /responses
examples
server
#19873
opened Feb 25, 2026 by
samikama
[ggml-quants] Add memsets and other fixes for IQ quants
ggml
changes relating to the ggml tensor library for machine learning
#19861
opened Feb 24, 2026 by
bartowski1182
server : add default-model preset and fallback logic
examples
server
#19855
opened Feb 24, 2026 by
mikhail-shevtsov-wiregate
ggml-webgpu: Support non-contiguous src0 and overlapping src0/src1 in binary ops
ggml
changes relating to the ggml tensor library for machine learning
testing
Everything test related
#19850
opened Feb 24, 2026 by
yomaytk
opencl: add optimized q4_1 mm kernel for adreno
ggml
changes relating to the ggml tensor library for machine learning
OpenCL
Issues specific to the OpenCL backend
common : refactor cache to use hierarchical directory layout
#19828
opened Feb 23, 2026 by
angt
implemented max pooling for embeddings
examples
python
python script changes
server
#19812
opened Feb 22, 2026 by
lorenzocesconetto