Thanks for your excellent work!
I’ve recently been trying to reproduce your results. However, I noticed that when I set the weight of the alignment loss (--align_loss_coeff) to 0, the training loss does not change noticeably. Would you be willing to share your loss curves for reference?
Below are my current results:
And this is my experiment setting:
torchrun --standalone --nnodes 1 --nproc-per-node 4 vla-scripts/finetune_align.py \
  --vla_path ckpts/openvla-7b \
  --vggt_path ckpts/vggt.pt \
  --data_root_dir /mnt/nas/datasets/Spatial-Forcing/modified_libero_rlds/ \
  --dataset_name libero_spatial_no_noops \
  --run_root_dir ckpts/training_results/ \
  --pooling_func bilinear \
  --vla_layers_align 24 \
  --vggt_layers_align -1 \
  --align_loss_type cosine \
  --align_loss_coeff 0.5 \
  --use_l1_regression True \
  --use_diffusion False \
  --use_film False \
  --use_vlm_norm True \
  --use_vggt_pe True \
  --num_images_in_input 2 \
  --use_proprio True \
  --batch_size 8 \
  --learning_rate 5e-4 \
  --num_steps_before_decay 100000 \
  --max_steps 150005 \
  --save_freq 10000 \
  --save_latest_checkpoint_only True \
  --merge_lora_during_training False \
  --image_aug True \
  --lora_rank 32 \
  --wandb_entity "yuhengyuan" \
  --wandb_project "spatial-forcing" \
  --run_id_override "spatial-forcing-7b-finetuned-libero-spatial"