What is the recommended way to add a new base model to TuneOS? #31
I want to fine-tune a model that isn't in the supported list (e.g. Qwen2 or Phi-4). What's the right way to add support for a new base model in TuneOS? Do I need to modify the trainer, the config, or just the API model list?
Replies: 1 comment
To add a new base model:

1. Add it to `app/api.py`: append an entry to `_SUPPORTED_MODELS` with `name`, `hf_id`, and `notes` (VRAM estimate).
2. Test compatibility: run a quick LoRA injection test using `trainer/lora.py` with the new model ID. Most causal LM models work out of the box via `AutoModelForCausalLM`.
3. Check target modules: different architectures use different attention layer names. For Qwen2 use `["q_proj", "v_proj"]`, for Phi use `["q_proj", "k_proj", "v_proj", "dense"]`. Update `lora_cfg` defaults accordingly.
4. Update `docs/supported-models.md` with VRAM requirements and any special flags (e.g. `trust_remote_code=True` for some models).

That's it; no changes to the trainer pipeline itself are needed.
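
The registry entry and the target-module check described above could look roughly like the sketch below. It assumes an entry is a plain dict; the real shape of `_SUPPORTED_MODELS` in `app/api.py` is not shown in this thread. `NEW_MODEL_ENTRY`, `TARGET_MODULES`, and `missing_targets` are hypothetical names used only for illustration, and the `notes` value is a placeholder rather than a measured figure. Module names can vary even within one model family (some Phi releases fuse the attention projections into a single `qkv_proj`, for example), so checking the suggested names against the model's actual module names before training is a cheap sanity test. The matching rule used here (exact name or dotted suffix) is intended to mirror how PEFT treats a list of `target_modules`.

```python
from typing import Iterable, List

# Assumed entry shape; the real _SUPPORTED_MODELS in app/api.py may differ.
NEW_MODEL_ENTRY = {
    "name": "Qwen2-7B",
    "hf_id": "Qwen/Qwen2-7B",  # example Hugging Face model ID
    "notes": "VRAM estimate: fill in after a test run",  # placeholder, not measured
}

# Target-module suggestions quoted from the steps above.
TARGET_MODULES = {
    "qwen2": ["q_proj", "v_proj"],
    "phi": ["q_proj", "k_proj", "v_proj", "dense"],
}


def missing_targets(module_names: Iterable[str], targets: Iterable[str]) -> List[str]:
    """Return the targets that match no module (exact name or dotted suffix)."""
    names = list(module_names)
    return [
        target
        for target in targets
        if not any(n == target or n.endswith("." + target) for n in names)
    ]


# Stand-in module names for illustration. With a loaded model, collect them
# with: [name for name, _ in model.named_modules()]
example_names = [
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.self_attn.k_proj",
    "model.layers.0.self_attn.v_proj",
    "model.layers.0.self_attn.o_proj",
]

print(missing_targets(example_names, TARGET_MODULES["qwen2"]))  # -> []
```

An empty result means every suggested name matched at least one module. Any name that comes back did not match, which usually means the `lora_cfg` defaults need adjusting for that architecture.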