Tunable GPU block sizes #735

@JPRichings

Noticed that

const int NUM_THREADS_PER_BLOCK = 128;

is fixed for all target hardware and is a bit large for common tuning recommendations.

The plan is to replace this with a compile-time default plus a setter/getter interface, so that the block size can be varied at run time in performance tuning tests.
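As a rough illustration of what such an interface might look like, here is a minimal host-side sketch in C. All names (`DEFAULT_THREADS_PER_BLOCK`, `MAX_THREADS_PER_BLOCK`, `threads_per_block_get`, `threads_per_block_set`, `blocks_for`) are hypothetical and not taken from the code base. The upper limit of 1024 is an assumption based on the per-block thread limit of current NVIDIA GPUs, and other targets may differ.

```c
#include <assert.h>

/* Compile-time default (hypothetical macro name). It can be overridden
 * at build time, e.g. with -DDEFAULT_THREADS_PER_BLOCK=64 */
#ifndef DEFAULT_THREADS_PER_BLOCK
#define DEFAULT_THREADS_PER_BLOCK 128
#endif

/* Assumed upper limit on threads per block; adjust for the target. */
#define MAX_THREADS_PER_BLOCK 1024

/* Current run-time value, initialised from the compile-time default. */
static int threads_per_block_ = DEFAULT_THREADS_PER_BLOCK;

/* Getter: return the block size currently in use. */
int threads_per_block_get(void) {
  return threads_per_block_;
}

/* Setter: return 0 on success, or -1 (leaving the current value
 * unchanged) if the request is outside [1, MAX_THREADS_PER_BLOCK]. */
int threads_per_block_set(int nthreads) {
  if (nthreads < 1 || nthreads > MAX_THREADS_PER_BLOCK) return -1;
  threads_per_block_ = nthreads;
  return 0;
}

/* Number of blocks needed to cover nsites work items at the current
 * block size (integer division rounded up). */
int blocks_for(int nsites) {
  assert(nsites >= 0);
  int tpb = threads_per_block_get();
  return (nsites + tpb - 1) / tpb;
}
```

A tuning test could then loop over candidate sizes (say 32, 64, 128, 256), call `threads_per_block_set()` before each timed kernel launch, and use `blocks_for()` to compute the matching grid size.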
