Description
Dear author,
I encountered an issue while using the model. I attempted to change the training resolution, but training failed with the error below. I would appreciate your guidance on resolving this. Thank you!
```
Traceback (most recent call last):
  File "/home/avlab/FontDiffuser-main/train.py", line 272, in <module>
    main()
  File "/home/avlab/FontDiffuser-main/train.py", line 189, in main
    noise_pred, offset_out_sum = model(
        x_t=noisy_target_images,
        ...<2 lines>...
        content_images=content_images,
        content_encoder_downsample_size=args.content_encoder_downsample_size)
  File "/home/avlab/anaconda3/envs/fontdiffuser/lib/python3.13/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/avlab/anaconda3/envs/fontdiffuser/lib/python3.13/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/avlab/FontDiffuser-main/src/model.py", line 49, in forward
    out = self.unet(
        x_t,
        ...<2 lines>...
        content_encoder_downsample_size=content_encoder_downsample_size,
    )
  File "/home/avlab/anaconda3/envs/fontdiffuser/lib/python3.13/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/avlab/anaconda3/envs/fontdiffuser/lib/python3.13/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/avlab/FontDiffuser-main/src/modules/unet.py", line 278, in forward
    sample, offset_out = upsample_block(
        hidden_states=sample,
        ...<3 lines>...
        encoder_hidden_states=encoder_hidden_states[2],
    )
  File "/home/avlab/anaconda3/envs/fontdiffuser/lib/python3.13/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/avlab/anaconda3/envs/fontdiffuser/lib/python3.13/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/avlab/FontDiffuser-main/src/modules/unet_blocks.py", line 572, in forward
    offset = sc_inter_offset(res_hidden_states, style_content_feat)
  File "/home/avlab/anaconda3/envs/fontdiffuser/lib/python3.13/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/avlab/anaconda3/envs/fontdiffuser/lib/python3.13/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/avlab/FontDiffuser-main/src/modules/attention.py", line 305, in forward
    style_content_hidden_states = self.gnorm_s(style_content_hidden_states)
  File "/home/avlab/anaconda3/envs/fontdiffuser/lib/python3.13/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/avlab/anaconda3/envs/fontdiffuser/lib/python3.13/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/avlab/anaconda3/envs/fontdiffuser/lib/python3.13/site-packages/torch/nn/modules/normalization.py", line 313, in forward
    return F.group_norm(input, self.num_groups, self.weight, self.bias, self.eps)
  File "/home/avlab/anaconda3/envs/fontdiffuser/lib/python3.13/site-packages/torch/nn/functional.py", line 2965, in group_norm
    return torch.group_norm(
        input, num_groups, weight, bias, eps, torch.backends.cudnn.enabled
    )
RuntimeError: Expected weight to be a vector of size equal to the number of channels in input, but got weight of shape [128] and input of shape [4, 256, 12, 12]
```
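For context, the final frame is a plain `nn.GroupNorm` channel mismatch: GroupNorm's learned `weight` has one entry per channel, so a layer built for 128 channels rejects a 256-channel feature map. The sketch below is generic PyTorch, not FontDiffuser code — it only reproduces the symptom; why the changed resolution routes a 256-channel tensor into `gnorm_s` (built for 128) would need to be confirmed against the model's config and the `content_encoder_downsample_size` setting.

```python
import torch
import torch.nn as nn

# A GroupNorm layer fixes num_channels at construction time;
# its weight/bias are vectors of that length (here, [128]).
gnorm = nn.GroupNorm(num_groups=32, num_channels=128)

# Matching channel count: works regardless of spatial size.
out = gnorm(torch.randn(4, 128, 12, 12))
print(out.shape)  # torch.Size([4, 128, 12, 12])

# Mismatched channel count: raises the same RuntimeError as the
# traceback above ("Expected weight to be a vector of size equal
# to the number of channels in input ...").
try:
    gnorm(torch.randn(4, 256, 12, 12))
except RuntimeError as e:
    print(type(e).__name__, "-", e)
```

A practical way to narrow it down is to print `style_content_hidden_states.shape` just before the `self.gnorm_s(...)` call in `src/modules/attention.py` at each resolution, and compare it to the `num_channels` the layer was constructed with.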