Hi, I ran the DPO code with the exact configuration provided for zephyr-7b-beta to reproduce the zephyr-7b-dpo-full model, but I am getting the following error:
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument index in method wrapper_CUDA__index_select)
It looks like either the model or the data was not moved to CUDA. What could be wrong? Thank you.
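To narrow down which side is still on the CPU, a check along these lines might help. This is only a minimal sketch: `group_by_device` is a hypothetical helper name, not part of the DPO code, and it assumes only that each object exposes a `.device` attribute, as PyTorch tensors do.

```python
from collections import defaultdict


def group_by_device(named_tensors):
    """Group names by device for an iterable of (name, object) pairs.

    Hypothetical diagnostic helper: it only reads `.device`, so it works
    with PyTorch tensors or anything else exposing that attribute.
    """
    groups = defaultdict(list)
    for name, obj in named_tensors:
        device = getattr(obj, "device", None)
        if device is None:
            continue  # skip entries without a device, e.g. lists or strings
        groups[str(device)].append(name)
    return dict(groups)
```

Assuming `model` is the loaded model and `batch` is a dict of tensors from the dataloader (both names are placeholders), calling `group_by_device(model.named_parameters())`, `group_by_device(model.named_buffers())`, and `group_by_device(batch.items())` just before the failing forward pass should show whether any entries are listed under `cpu` while the rest are under `cuda:0`.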