
Add timm_finetune module: iNaturalist-pretrained backbone fine-tuning (ConvNeXt-L / EVA-02)#51

Open
trgardos wants to merge 1 commit into multitask-full-species from timm-domain-backbone


@trgardos

Contributor

What

A self-contained module, finetuning/timm_finetune/, for fine-tuning iNaturalist-pretrained
(timm) backbones. This is the Tier 1.1 "domain backbone" lever, the single highest-leverage
item not yet attempted for raising Kaggle-2022 macro-F1. SWIN_finetuning_advanced.py and its
configs are unchanged (per request: separate folder, own configs and scripts).

train.py is forked from the SWIN trainer with a timm path added; the backbone-agnostic
machinery (mixup collator, MixupTrainer, EMA, custom eval loop, metrics, balanced softmax, and
the full-species multi-task heads) is reused unchanged. Backbone is chosen via
model.backbone_type: hf|timm.

Key design

  • build_backbone() → timm.create_model(name, num_classes=0) (pooled features), with hf-hub
    support so iNat21 checkpoints not in timm's registry load by repo id; HF SWIN path preserved.
  • Generalized MultiTaskModel (renamed from MultiTaskSwinModel) — _pooled() returns the
    pooled vector for either backbone; grad-checkpointing routes to timm set_grad_checkpointing.
  • Preprocessing (size/mean/std) from timm.data.resolve_model_data_config for timm; backbone
    metadata saved into config.json for rebuild. Full fine-tune only; timm + single-task/ArcFace
    raise a clear error.
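
The dispatch described in this list can be sketched roughly as follows. This is an illustration, not the code in train.py: the helper names (`hub_model_name`, `check_supported`, `sketch_build_timm_backbone`) and the `from_hub`, `multi_task`, and `use_arcface` arguments are assumptions. Only `timm.create_model(..., num_classes=0)`, the `hf-hub:` name prefix, `set_grad_checkpointing`, and `timm.data.resolve_model_data_config` are taken from timm itself.

```python
from typing import Any, Dict, Tuple


def hub_model_name(name: str, from_hub: bool) -> str:
    """Prefix a repo id with 'hf-hub:' so timm loads it from the Hugging Face Hub."""
    if from_hub and not name.startswith("hf-hub:"):
        return f"hf-hub:{name}"
    return name


def check_supported(backbone_type: str, multi_task: bool, use_arcface: bool) -> None:
    """Fail early on combinations the PR lists as unsupported (hypothetical flags)."""
    if backbone_type not in ("hf", "timm"):
        raise ValueError(
            f"model.backbone_type must be 'hf' or 'timm', got {backbone_type!r}"
        )
    if backbone_type == "timm" and (not multi_task or use_arcface):
        raise ValueError(
            "timm backbones support multi-task training only "
            "(single-task and ArcFace are not wired up)"
        )


def sketch_build_timm_backbone(
    name: str, from_hub: bool = True, grad_checkpointing: bool = False
) -> Tuple[Any, Dict[str, Any]]:
    """Return (backbone, preprocessing config) for a timm model."""
    # Imported lazily so the pure-Python helpers above work without timm installed.
    import timm

    # num_classes=0 drops the classifier, so forward() returns pooled features.
    model = timm.create_model(
        hub_model_name(name, from_hub), pretrained=True, num_classes=0
    )
    if grad_checkpointing:
        model.set_grad_checkpointing(True)
    # Input size, mean, and std come from the checkpoint's pretrained config.
    data_cfg = timm.data.resolve_model_data_config(model)
    return model, data_cfg
```

Calling `sketch_build_timm_backbone("timm/convnext_large_mlp.laion2b_ft_augreg_inat21")` would download the weights, so it needs network access and enough disk space for a ConvNeXt-L checkpoint.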

Backbones (what actually loads)

The plan's timm/convnextv2_base.inat21_384 has no loadable repo (BBracke/... → 404). The
established iNat21 timm checkpoints, verified to load via hf-hub:

  • convnext_large_inat_384_2gpu.yml (default): timm/convnext_large_mlp.laion2b_ft_augreg_inat21
    @384 (ConvNeXt-L; closest to the ConvNeXt/384 intent).
  • eva02_large_inat_336_2gpu.yml: timm/eva02_large_patch14_clip_336.merged2b_ft_inat21 @336.

Both: multi-task on the full 15.5k scientificNameEncoded species (== leaderboard metric),
balanced softmax, EMA, medium aug, TTA wired, 2-GPU effective batch 128.
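
For readers unfamiliar with balanced softmax, here is a minimal single-sample sketch in plain Python. It assumes the common formulation in which the log of each class's training count is added to the logits before cross-entropy; whether train.py uses exactly this variant is an assumption, and the function name is hypothetical.

```python
import math
from typing import Sequence


def balanced_softmax_loss(
    logits: Sequence[float], target: int, class_counts: Sequence[int]
) -> float:
    """Cross-entropy on logits shifted by log class frequency (one sample)."""
    # Shifting by log(count) makes frequent classes "cheaper" to predict during
    # training, so the raw logits must work harder for rare classes.
    adjusted = [z + math.log(n) for z, n in zip(logits, class_counts)]
    m = max(adjusted)  # subtract the max for numerical stability
    log_norm = m + math.log(sum(math.exp(a - m) for a in adjusted))
    return log_norm - adjusted[target]


# With equal counts the shift cancels and this reduces to plain cross-entropy.
uniform = balanced_softmax_loss([0.0, 0.0], target=0, class_counts=[1, 1])
# With a 9:1 imbalance, the rare class incurs the larger loss for equal logits.
common = balanced_softmax_loss([0.0, 0.0], target=0, class_counts=[9, 1])
rare = balanced_softmax_loss([0.0, 0.0], target=1, class_counts=[9, 1])
```

At inference time the shift is normally omitted, which is what counteracts the bias toward frequent classes on a long-tailed label set such as the 15.5k species here.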

Validation

py_compile + import; both configs parse; end-to-end main() smoke with real ConvNeXt-L
weights: backbone_type=timm → loaded (1536-d) → 272 families / 2564 genera / 15500 species
train + eval + EMA ran; the only failure was the checkpoint write running out of /tmp disk
space (real runs write to /projectnb), which is an environment limit, not a code defect.

Setup / notes

  • Needs pip install --user timm (timm 1.0.27 used).
  • Launch: EMAIL=tgardos@bu.edu NGPUS=2 RUN_PREFIX=CONVNEXT_L_INAT_384 CONFIG=configs/convnext_large_inat_384_2gpu.yml SEEDS="0" bash submit.sh
  • Inference/submission not wired for timm yet (SWIN prediction.py is SWIN-specific) — a small
    timm predictor using the saved config.json is a follow-up.
  • Stacked on multitask-full-species (PR #50, "Multi-task: species head targets full Kaggle
    species (15.5k), not epithet (6.9k)"); retarget to main after that merges.
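
As a starting point for the follow-up predictor, the round trip through the saved config.json might look like the sketch below. The `"backbone"` key, the field names, and both function names are assumptions for illustration; the actual layout written by train.py may differ.

```python
import json
import os
import tempfile
from typing import Any, Dict


def save_backbone_metadata(out_dir: str, meta: Dict[str, Any]) -> str:
    """Merge backbone metadata into out_dir/config.json under a 'backbone' key."""
    path = os.path.join(out_dir, "config.json")
    cfg: Dict[str, Any] = {}
    if os.path.exists(path):
        with open(path) as f:
            cfg = json.load(f)  # keep whatever the trainer already wrote
    cfg["backbone"] = meta  # hypothetical key name
    with open(path, "w") as f:
        json.dump(cfg, f, indent=2)
    return path


def load_backbone_metadata(out_dir: str) -> Dict[str, Any]:
    """Read back the metadata a predictor would need to rebuild the backbone."""
    with open(os.path.join(out_dir, "config.json")) as f:
        cfg = json.load(f)
    if "backbone" not in cfg:
        raise KeyError("config.json has no 'backbone' entry; cannot rebuild model")
    return cfg["backbone"]


# Example values taken from this PR: the default checkpoint, 384 px input, 1536-d features.
example_meta = {
    "backbone_type": "timm",
    "name": "timm/convnext_large_mlp.laion2b_ft_augreg_inat21",
    "input_size": [3, 384, 384],
    "num_features": 1536,
}
with tempfile.TemporaryDirectory() as tmp_dir:
    save_backbone_metadata(tmp_dir, example_meta)
    restored = load_backbone_metadata(tmp_dir)
```

A predictor could pass `restored["name"]` to timm to recreate the architecture and then load the fine-tuned weights from the same checkpoint directory.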

🤖 Generated with Claude Code

…ning

New self-contained module finetuning/timm_finetune/ for the Tier 1.1 lever —
fine-tune iNaturalist-pretrained backbones instead of ImageNet-22k SWIN. The
SWIN trainer and its configs are untouched.

train.py is forked from SWIN_finetuning_advanced.py with a timm backbone path
added (build_backbone with hf-hub support, generalized MultiTaskModel wrapper,
timm-derived preprocessing, backbone metadata saved to config.json); the
backbone-agnostic machinery (mixup collator, MixupTrainer, EMA, eval loop,
metrics, balanced softmax, full-species multi-task heads) is reused unchanged.
Backbone is selected via model.backbone_type: hf|timm. Full fine-tune only;
timm + single-task/ArcFace intentionally raise a clear error.

Configs (multi-task on full 15.5k scientificNameEncoded == Kaggle metric,
balanced softmax, EMA, medium aug, TTA wired, 2-GPU eff batch 128):
- convnext_large_inat_384_2gpu.yml (default):
  timm/convnext_large_mlp.laion2b_ft_augreg_inat21 @384. NB the plan's
  convnextv2_base.inat21_384 has no loadable repo (404); this ConvNeXt-L
  iNat21 is the closest established one.
- eva02_large_inat_336_2gpu.yml:
  timm/eva02_large_patch14_clip_336.merged2b_ft_inat21 @336.

Plus train.sh / submit.sh (mirror the SWIN launchers) and README.

Validated: py_compile, import, both configs parse, and an end-to-end main()
smoke (real ConvNeXt-L weights -> 272 families / 2564 genera / 15500 species,
train + eval + EMA ran; only the checkpoint write failed on /tmp disk space,
not a code path). Needs `pip install --user timm`. timm inference/submission
(SWIN prediction.py is SWIN-specific) is a documented follow-up.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@trgardos trgardos requested a review from Farid-Karimli June 12, 2026 14:54
@trgardos

Contributor Author

First run is ongoing and recorded at https://wandb.ai/gardoslab/herbdl/runs/convnext_l_inat_384_seed0
