Hi, thank you for open-sourcing DualCodec.
Although the paper does not provide the full loss formula explicitly, from the code I understood the total loss to be approximately:
L_total = 15 * L_distill + 15 * L_mel + 0.25 * L_sem_commit + L_sem_codebook + 0.25 * L_ac_commit + L_ac_codebook + L_g + 2 * L_feat [+ 15 * L_sem_spec]
I wanted to ask how these loss weights were determined. They do not seem to be chosen simply to match the scale of each loss term.
In particular, why was the distill loss weight set to 15, and what happens if this weight is set lower or higher?
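To make the question concrete, here is a minimal sketch of how I read the weighting. The dictionary keys and function names are my own placeholders, not identifiers from the DualCodec repository, and the weights are only my reading of the code, so please correct me if any of them are wrong.

```python
# Weights as I read them from the training code (an assumption on my part).
# Key names are hypothetical and do not come from the repository.
LOSS_WEIGHTS = {
    "distill": 15.0,
    "mel": 15.0,
    "sem_commit": 0.25,
    "sem_codebook": 1.0,
    "ac_commit": 0.25,
    "ac_codebook": 1.0,
    "g": 1.0,
    "feat": 2.0,
    "sem_spec": 15.0,  # optional term, shown in brackets in the formula above
}


def total_loss(losses, weights=LOSS_WEIGHTS):
    """Weighted sum over whichever loss terms are present in `losses`."""
    return sum(weights[name] * value for name, value in losses.items())


def weighted_shares(losses, weights=LOSS_WEIGHTS):
    """Fraction of the total contributed by each weighted term."""
    total = total_loss(losses, weights)
    return {name: weights[name] * value / total for name, value in losses.items()}


if __name__ == "__main__":
    # Placeholder values of 1.0 for every term, only to show the effect of the weights.
    # These are not measured losses.
    dummy = {name: 1.0 for name in LOSS_WEIGHTS if name != "sem_spec"}
    print(total_loss(dummy))
    print(weighted_shares(dummy)["distill"])
```

With equal raw values, the distill and mel terms would dominate the sum, which is why I am curious whether the weight of 15 was tuned empirically or chosen for some other reason.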
Thank you.