To add a new base model:

  1. Add it to app/api.py — append an entry to _SUPPORTED_MODELS with name, hf_id, and notes (VRAM estimate).

  2. Test compatibility — run a quick LoRA injection test using trainer/lora.py with the new model ID. Most causal LM models work out of the box via AutoModelForCausalLM.

  3. Check target modules — different architectures use different attention layer names. For Qwen2 use ["q_proj", "v_proj"], for Phi use ["q_proj", "k_proj", "v_proj", "dense"]. Update lora_cfg defaults accordingly.

  4. Update docs/supported-models.md with VRAM requirements and any special flags (e.g. trust_remote_code=True for some models).

That's it: no changes to the trainer pipeline itself are …
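
Before running the full injection test from step 2, a cheap sanity check on the target modules from step 3 might look like the sketch below. `missing_target_modules` is a hypothetical helper, not part of trainer/lora.py, and the module names are a hand-written stand-in for what `model.named_modules()` would return on a loaded model.

```python
from typing import Iterable, List


def missing_target_modules(module_names: Iterable[str], targets: List[str]) -> List[str]:
    """Return the targets that match no module by its last dotted component."""
    leaf_names = {name.rsplit(".", 1)[-1] for name in module_names}
    return [t for t in targets if t not in leaf_names]


# Stand-in for `[n for n, _ in model.named_modules()]`; real names depend on
# the architecture being added.
example_names = [
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.self_attn.k_proj",
    "model.layers.0.self_attn.v_proj",
    "model.layers.0.self_attn.o_proj",
]

print(missing_target_modules(example_names, ["q_proj", "v_proj"]))  # []
print(missing_target_modules(example_names, ["q_proj", "dense"]))   # ['dense']
```

A non-empty result suggests the lora_cfg defaults list a layer name the architecture does not have.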

Answer selected by SahilKumar75