
Add use_flash_attn for better FA + FA2 feature gating #825

Merged: alvarobartt merged 3 commits into main from flash-attn-feature-gating on Feb 12, 2026

Conversation

@alvarobartt
Member

What does this PR do?

This PR adds use_flash_attn, a function that checks whether a given model architecture supports Flash Attention based on the active features and the compute capability, and that also captures the value of the env var USE_FLASH_ATTENTION in a unified way.

This PR was inspired by both #778 and #809, which create similar functions.
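For context, a minimal sketch of what such a helper could look like, assuming hypothetical `flash-attn` / `flash-attn-v1` Cargo feature names and the usual sm_75 (FA) vs. sm_80+ (FA2) compute-capability split; the actual signature and gating logic are in the PR diff:

```rust
use std::env;

/// Sketch: decide whether Flash Attention should be used, combining the
/// crate features enabled at compile time, the GPU compute capability,
/// and the USE_FLASH_ATTENTION environment variable.
pub fn use_flash_attn(compute_cap: usize) -> bool {
    // Respect an explicit opt-out via the environment variable.
    let env_enabled = env::var("USE_FLASH_ATTENTION")
        .map(|v| !matches!(v.to_lowercase().as_str(), "0" | "false" | "off"))
        .unwrap_or(true);
    if !env_enabled {
        return false;
    }

    // FA2 targets Ampere and newer (sm_80+); FA v1 targets Turing (sm_75).
    // The feature names below are assumptions for illustration.
    let fa2 = cfg!(feature = "flash-attn") && compute_cap >= 80;
    let fa1 = cfg!(feature = "flash-attn-v1") && compute_cap == 75;

    fa1 || fa2
}
```

Centralizing the env-var read and the feature/capability checks in one function avoids the drift between the per-backend checks that #778 and #809 each introduced separately.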

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the documentation guidelines.
  • Did you write any new necessary tests? If applicable, did you include or update the insta snapshots?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

Co-Authored-By: Michael Feil <michaelfeil@users.noreply.github.com>
@alvarobartt alvarobartt added this to the v1.9.0 milestone Feb 12, 2026
@alvarobartt alvarobartt merged commit 5cdaee0 into main Feb 12, 2026
17 checks passed
@alvarobartt alvarobartt deleted the flash-attn-feature-gating branch February 12, 2026 15:43