Insufficient shared memory available on the GPU when using python binding on A100 #511

@WanqiYuan

Hi there,

Thanks for releasing such amazing code. I tried to replace several MLPs in my code with FullyFusedMLP using the tcnn Python bindings. I set the number of neurons to 64 and am running on an A100. I compiled the Python bindings on the A100 following the instructions. However, I got an error:

FullyFusedMLP: insufficient shared memory available on the GPU. Reduce n_neurons or use CutlassMLP (better compatibility but slower) instead.

I would expect the A100 to have plenty of shared memory, and neither the number of layers nor the number of neurons in the FullyFusedMLP is large. Am I missing something when compiling?
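For reference, here is a minimal sketch of the kind of network config described above. Only `n_neurons = 64` and the two network types (`FullyFusedMLP`, `CutlassMLP`) come from this report; the activation, the number of hidden layers, and the helper names `make_network_config` and `with_cutlass` are assumptions made for illustration.

```python
import copy


def make_network_config(n_neurons=64, n_hidden_layers=2, otype="FullyFusedMLP"):
    """Build a tiny-cuda-nn style network config dict.

    n_neurons=64 matches the report; the activation and the
    number of hidden layers are assumed values.
    """
    return {
        "otype": otype,
        "activation": "ReLU",          # assumption
        "output_activation": "None",   # assumption
        "n_neurons": n_neurons,
        "n_hidden_layers": n_hidden_layers,  # assumption
    }


def with_cutlass(config):
    """Return a copy of the config that uses CutlassMLP, the
    alternative named in the error message. The input is not modified."""
    fallback = copy.deepcopy(config)
    fallback["otype"] = "CutlassMLP"
    return fallback


if __name__ == "__main__":
    config = make_network_config()
    print(config)
    print(with_cutlass(config))
    # With the compiled bindings and a CUDA GPU available, the config
    # would be passed along these lines (dimensions are placeholders):
    #   import tinycudann as tcnn
    #   net = tcnn.Network(n_input_dims=32, n_output_dims=16, network_config=config)
```

Switching `otype` to `CutlassMLP` is only the workaround the error message itself proposes; it does not explain why `FullyFusedMLP` fails here.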

Thanks!
