## Summary
`llama-finetune` fails on every modern transformer architecture tested (LLaMA 3.2, Qwen2.5, SmolLM2) due to three interconnected issues in the backward-pass infrastructure. The failures affect builds b7405 through b7717 (current master).
## Environment

- Build: b7717 (`537d424`), Linux x86_64, GCC 13.3.0
- Model: `Llama-3.2-1B-Instruct-f32.gguf`
## Failures

| # | Trigger | Location | Cause |
|---|---------|----------|-------|
| 1 | Default | `ggml.c:6928` | `GGML_OP_SET_ROWS` not in allowed view-ops, no backward impl |
| 2 | `--flash-attn` | `ggml.c:ggml_compute_backward()` | `GGML_OP_FLASH_ATTN_EXT` missing backward case |
| 3 | FA disabled | `ggml.c:6763` | Graph node overflow from standard attention |
## Reproduction

```sh
./build/bin/llama-finetune \
  -m Llama-3.2-1B-Instruct-f32.gguf \
  -f train.txt -c 256 --epochs 1
```
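All three failures start from the same base command; the only flag named in the table above is `--flash-attn`, and the exact syntax for disabling flash attention depends on the build's argument parsing, so that variant is left as a comment rather than guessed at.

```sh
# Failure 1: default path (GGML_OP_SET_ROWS view-ops assertion)
./build/bin/llama-finetune -m Llama-3.2-1B-Instruct-f32.gguf -f train.txt -c 256 --epochs 1

# Failure 2: flash attention enabled (missing FLASH_ATTN_EXT backward case)
./build/bin/llama-finetune -m Llama-3.2-1B-Instruct-f32.gguf -f train.txt -c 256 --epochs 1 --flash-attn

# Failure 3: flash attention disabled (graph node overflow);
# use whatever FA-disable argument the build accepts.
```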
## Related

- #15279: same `view_src` assertion (closed as completed)
- #15090: same graph overflow (closed as stale)