Relax SM90 shared-memory estimate for wide values #16

Merged
ivandobskygithub merged 2 commits into main from codex/fix-flashattention-shared-memory-error-plgj0k on Nov 25, 2025
@ivandobskygithub
Owner

Summary

  • reduce the shared-memory buffering estimate for value dimensions 256+ to avoid over-clamping SM90 tile sizes
  • keep the shared-memory budget test in sync with the updated buffering heuristic
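
To illustrate the kind of heuristic the summary describes, here is a minimal sketch. It is not the code from this PR: every function name, the stage counts, and the tile shapes are hypothetical, and the only hard number is the 227 KB per-block shared-memory limit on SM90. It assumes the estimate counts one buffered stage for values of dimension 256 or more instead of two, so a wide-value tile is no longer clamped unnecessarily.

```python
# Hypothetical sketch only; names and stage counts are assumptions, not the PR's code.
SM90_SMEM_BUDGET_BYTES = 227 * 1024  # max opt-in shared memory per thread block on SM90


def value_buffer_stages(head_dim_v: int, relaxed: bool = True) -> int:
    """Assumed heuristic: count fewer buffered V stages for wide values (256+)."""
    if relaxed and head_dim_v >= 256:
        return 1
    return 2


def estimate_smem_bytes(block_m, block_n, head_dim, head_dim_v,
                        elem_bytes=2, relaxed=True):
    """Rough shared-memory estimate for one Q tile plus buffered K and V tiles."""
    q_bytes = block_m * head_dim * elem_bytes
    k_bytes = block_n * head_dim * elem_bytes * 2  # K assumed double-buffered
    v_bytes = (block_n * head_dim_v * elem_bytes
               * value_buffer_stages(head_dim_v, relaxed))
    return q_bytes + k_bytes + v_bytes


def clamp_block_n(block_m, block_n, head_dim, head_dim_v,
                  elem_bytes=2, relaxed=True, min_block_n=16):
    """Halve block_n until the estimate fits the SM90 budget."""
    while (block_n > min_block_n and
           estimate_smem_bytes(block_m, block_n, head_dim, head_dim_v,
                               elem_bytes, relaxed) > SM90_SMEM_BUDGET_BYTES):
        block_n //= 2
    return block_n


if __name__ == "__main__":
    # Same tile request, with and without the relaxed wide-value estimate.
    print(clamp_block_n(128, 128, 192, 256, relaxed=False))
    print(clamp_block_n(128, 128, 192, 256, relaxed=True))
```

Under these assumed numbers, the stricter estimate halves `block_n` for a 128x128 tile with a 256-wide value dimension, while the relaxed estimate keeps the requested size. A budget test like the one named under Testing would need to apply the same stage count to stay in sync.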

Testing

  • pytest tests/hopper/test_tile_size_shared_memory.py -q

@ivandobskygithub merged commit b8086a2 into main on Nov 25, 2025