Blackwell GPU compatibility

On a Blackwell GPU, running the inference demo fails with the following error:

```
$ bash inference_demo.sh
......
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
:
Traceback (most recent call last):
  File "/opt/modeling/molecule-generation/ODesign/src/utils/inference/infer_runner.py", line 201, in run
    pred_backbone_output, all_sequence_variants = self.predict(data)
  File "/opt/anaconda3/envs/odesign/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/opt/modeling/molecule-generation/ODesign/src/utils/inference/infer_runner.py", line 114, in predict
    data = to_device(data, self.device)
  File "/opt/modeling/molecule-generation/ODesign/src/utils/model/torch_utils.py", line 81, in to_device
    obj[k] = to_device(v, device)
  File "/opt/modeling/molecule-generation/ODesign/src/utils/model/torch_utils.py", line 100, in to_device
    return attr.evolve(obj, **updates)
  File "/opt/anaconda3/envs/odesign/lib/python3.10/site-packages/attr/_make.py", line 634, in evolve
    return cls(**changes)
  File "<attrs generated methods src.api.data_interface.OFeatureData>", line 124, in __init__
    self.__attrs_post_init__()
  File "/opt/modeling/molecule-generation/ODesign/src/api/_base.py", line 108, in __attrs_post_init__
    convert_types(self)
  File "/opt/modeling/molecule-generation/ODesign/src/api/_base.py", line 102, in convert_types
    converted_value = converter(current_value)
  File "/opt/modeling/molecule-generation/ODesign/src/api/_base.py", line 17, in to_mask_type
    return tensor.bool()
RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
```

It would be great to have an ODesign version that supports CUDA 13.0.
However, the required Python packages such as torch-scatter would also need to support that CUDA version, so I guess running ODesign on Blackwell may not be feasible at this time.
Are there any workarounds?
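
For context, `no kernel image is available for execution on the device` usually means the installed PyTorch binaries were not compiled for the GPU's compute capability. Below is a minimal sketch for checking that, not a definitive diagnostic. The helper `build_supports_device` is a hypothetical name of my own and not part of ODesign or PyTorch; it only uses `torch.cuda.get_device_capability()` and `torch.cuda.get_arch_list()`, and it assumes the usual CUDA rules (an `sm_XY` binary runs on devices of the same major version with an equal or higher minor version, and `compute_XY` PTX can be JIT-compiled for newer devices).

```python
def build_supports_device(capability, arch_list):
    """Return True if a build with `arch_list` can run on a GPU with `capability`.

    capability: (major, minor) tuple, as returned by torch.cuda.get_device_capability().
    arch_list:  strings such as "sm_90" or "compute_90", as returned by
                torch.cuda.get_arch_list().
    Hypothetical helper for illustration; suffixed variants such as "sm_90a"
    are ignored in this sketch.
    """
    major, minor = capability
    for arch in arch_list:
        kind, _, digits = arch.partition("_")
        if not digits.isdigit():
            continue  # skip entries like "sm_90a" that this sketch does not model
        arch_major, arch_minor = int(digits[:-1]), int(digits[-1])
        if kind == "sm":
            # Compiled binaries are compatible within the same major version only.
            if arch_major == major and arch_minor <= minor:
                return True
        elif kind == "compute":
            # PTX can be JIT-compiled for any equal or newer architecture.
            if (arch_major, arch_minor) <= (major, minor):
                return True
    return False


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None

    if torch is not None and torch.cuda.is_available():
        cap = torch.cuda.get_device_capability(0)
        archs = torch.cuda.get_arch_list()
        print("torch:", torch.__version__, "built with CUDA:", torch.version.cuda)
        print("device capability:", cap)
        print("compiled architectures:", archs)
        print("supported by this build:", build_supports_device(cap, archs))
    else:
        print("PyTorch with a visible CUDA device is required for the live check.")
```

If the last line prints `False` inside the `odesign` environment, the PyTorch build itself is the limiting factor, and any compiled extensions such as torch-scatter would need to be built against a matching PyTorch/CUDA combination as well.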

Blackwell GPU compatibility #5
