Do --max-prefill-tokens and --decode-bs specify the number of tokens for a single GPU, or the total number of tokens for the entire request?
It appears that they refer to the number of tokens per GPU.