How is Qwen3.6-35B-A3B with PFlash+DFlash+DDTree? #161

@TomLucidor

I saw that Laguna-XS.2 was tested for MoE + SWA (which is closer to Gemma 4), but not Qwen3.5/Qwen3.6, where MoE is mixed with linear attention. I'm wondering whether PFlash+DFlash+DDTree would still yield a 5x prefill/TTFT speedup and a 3x decode/throughput speedup on Qwen3.6-35B-A3B.

Labels: question (Further information is requested)
