Thanks to the author for the great work. The inference code uses a noise prior in the following code, but I am wondering why the noise enters this formula. Is there any theoretical analysis behind it? I ask because when I set it to 0 for single-video training, the LoRA does not seem to learn the motion correctly.
