I am trying to train VQGAN on the UCF-101 dataset with 4 A100s and 24 samples per device. The reconstruction and perceptual losses converge normally until the discriminator is enabled at 10k steps. In addition, the commitment loss diverges from the very start of training. How can I fix this?
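
For reference, the two pieces mentioned above can be sketched roughly as follows. This is only a minimal illustration, assuming the standard VQ-VAE commitment term and a hard step threshold for turning on the GAN loss. The function names, the `beta=0.25` default, and the flat lists of floats are hypothetical and are not taken from the actual training code.

```python
# Hypothetical helpers for illustration only; not the real training code.

def disc_weight(step, start_step=10_000, weight=1.0):
    """Return 0.0 before the discriminator is enabled, `weight` afterwards."""
    return weight if step >= start_step else 0.0


def commitment_loss(z_e, z_q, beta=0.25):
    """beta * mean squared distance between encoder outputs and their codes.

    z_e: encoder outputs, z_q: the selected codebook entries (flat lists).
    In an autograd framework z_q would be detached (stop-gradient), so this
    term only pulls the encoder towards the codebook.
    """
    if len(z_e) != len(z_q):
        raise ValueError("z_e and z_q must have the same length")
    squared = [(e - q) ** 2 for e, q in zip(z_e, z_q)]
    return beta * sum(squared) / len(squared)


if __name__ == "__main__":
    # GAN term is switched off before step 10k and on from step 10k onwards.
    print(disc_weight(9_999), disc_weight(10_000))
    # Commitment term for a toy pair of encoder outputs and codes.
    print(commitment_loss([1.0, 2.0], [1.0, 0.0]))
```

With a hard threshold like this, the generator objective changes abruptly at step 10k, which is where the reconstruction and perceptual losses stop converging in the run described above.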

