[Hyperloom] Optimize dsr1-fp8-mi355x-sglang, gptoss-fp4-mi355x-vllm#2

Open
lishuoshuo-amd wants to merge 1 commit into main from hyperloom/ci-20260416-0632

Conversation

@lishuoshuo-amd
Owner

Description

Automated performance optimization update from Hyperloom CI.

dsr1-fp8-mi355x-sglang

| Metric | Value |
| --- | --- |
| Baseline (tok/s/GPU) | 309.88 |
| Optimized (tok/s/GPU) | 331.37 |
| Optimization Gain | +6.9% |
| InferenceX Current (tok/s/GPU) | 310.09 |
| vs InferenceX | +6.9% |

Server flag changes:

  • --num-continuous-decode-steps: 4 → 8
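
For context, a minimal sketch of how this flag would be applied to an SGLang server launch. The model path, quantization flag, and port below are placeholders for illustration, not the actual Hyperloom CI configuration:

```shell
# Hypothetical SGLang launch sketch; model path and port are placeholders,
# not the actual CI config. Only --num-continuous-decode-steps reflects
# the change in this PR: the server runs 8 decode iterations per
# scheduling step instead of 4, reducing scheduler overhead per token.
python -m sglang.launch_server \
  --model-path deepseek-ai/DeepSeek-R1 \
  --quantization fp8 \
  --num-continuous-decode-steps 8 \
  --port 30000
```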

gptoss-fp4-mi355x-vllm

| Metric | Value |
| --- | --- |
| Baseline (tok/s/GPU) | 7344.93 |
| Optimized (tok/s/GPU) | 7855.14 |
| Optimization Gain | +7.0% |
| InferenceX Current (tok/s/GPU) | 6585.93 |
| vs InferenceX | +19.3% |

Server flag changes:

  • Add --max-num-seqs 256
  • Add --enable-chunked-prefill
  • Add --max-num-batched-tokens 16384
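
A minimal sketch of the resulting vLLM launch command. The model name and port are placeholders for illustration, not the actual Hyperloom CI configuration; only the three added flags reflect this PR:

```shell
# Hypothetical vLLM launch sketch; model name and port are placeholders,
# not the actual CI config. The three added flags cap concurrent
# sequences at 256, enable chunked prefill so long prompts are split
# across scheduling steps and interleaved with decode, and raise the
# per-step token budget to 16384.
vllm serve openai/gpt-oss-120b \
  --max-num-seqs 256 \
  --enable-chunked-prefill \
  --max-num-batched-tokens 16384 \
  --port 8000
```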

Related Issue

Automated by Hyperloom CI

Type of Change

  • Configuration change

Checklist

  • I have tested my changes locally
  • I have updated documentation if necessary
  • If I changed a container image or config, I have already updated perf-changelog.yaml

…4-mi355x-vllm

- dsr1-fp8-mi355x-sglang: --num-continuous-decode-steps: 4 → 8
- gptoss-fp4-mi355x-vllm: Add --max-num-seqs 256; Add --enable-chunked-prefill; Add --max-num-batched-tokens 16384
