GitHub - RentedNoodle/llama.den: Den experimental kernel forge — raw inline PTX tensor core path for Blackwell SM120. OMMA.SF.16864 cubins, SASS verification, fragment mapping. Where kernels are proven before promotion to den-nv.

GGUF inference engine for Blackwell SM120. NVFP4 tensor core path via OMMA.SF.16864 — the native 4-bit instruction instead of the DP4A fallback.

Not really a llama.cpp fork at this point. The NVFP4 stack, MoE dispatch, SSM kernels, governor FSM, and RT core integration are all custom. The upstream inheritance is basically just the GGML type system.

Build

Requires CUDA 12.8.

cmake -B build -G Ninja -DCMAKE_BUILD_TYPE=Release -DGGML_CUDA=ON -DCMAKE_CUDA_ARCHITECTURES="120a"
cmake --build build -j$(nproc)

Quick test

cuobjdump --dump-sass build/ggml/src/libggml.so | grep -c "OMMA.SF.16864"
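
The command above prints a count. A nonzero count means the build emitted the native instruction. As a minimal sketch, the same check can be wrapped so that it fails loudly when the count is zero. The function names `count_omma` and `check_omma` are hypothetical helpers, not part of the repository, and the default library path is simply the one used above. Unlike the one-liner, this sketch passes `-F` to grep so the dots in the mnemonic match literally.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of the repo) wrapping the SASS check.

# Count stdin lines that contain the literal OMMA.SF.16864 mnemonic.
# grep -c exits 1 when the count is 0, so "|| true" keeps the status clean.
count_omma() {
    grep -c -F "OMMA.SF.16864" || true
}

# Dump SASS from a library and fail if no OMMA.SF.16864 instruction is found.
# The default path is the one from the quick test; pass another path as $1.
check_omma() {
    local lib="${1:-build/ggml/src/libggml.so}"
    local n
    n=$(cuobjdump --dump-sass "$lib" | count_omma)
    if [ "$n" -gt 0 ]; then
        echo "ok: $n OMMA.SF.16864 instructions in $lib"
    else
        echo "fail: no OMMA.SF.16864 found in $lib" >&2
        return 1
    fi
}
```

After sourcing the file, `check_omma` with no arguments checks the default path and returns a nonzero status on failure, which makes it usable as a CI gate.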

License

MIT.

