Inspired by "Learning to Reason in 13 parameters", this project uses TinyLoRA + GRPO (32 parameters) to fine-tune Qwen2.5-Coder-3B-Instruct (or other models) for competitive programming.
Updated Mar 11, 2026 - Python
This repo works, but further updates are being made in a new repo. For details, see https://github.com/Chi-Shan0707/TinyLoRA-Qwen-Coder