Nvunnam/combined pr795 797 798 by nvunnam57128 · Pull Request #800 · Xilinx/onnx-mlir

nvunnam57128 · 2026-05-31T02:38:02Z

No description provided.

When the quantized eltwise (Add/Mul/Sub/Div/Tanh/Sqrt) feeding a Relu/LeakyRelu has more than one user, Pattern 2 today clones the eltwise into the activation slot. The original eltwise survives because of its other user(s), and Pattern 1 then claims it. The result is two qlinear-eltwise ops with identical inputs that compute the same value, e.g. for scene_parser around input.115/input.119:

    %140 = xir.qlinear_eltwise %139, %137 ... op_type="ADD"
    %141 = xir.qlinear_eltwise %139, %137 ... op_type="ADD", nonlinear="RELU"

This wastes a kernel invocation. The xmodel-flow on the same graph produces the equivalent "golden" form:

    %143 = xir.qlinear_eltwise %142, %140 op_type="ADD"
    %144 = xir.qlinear_eltwise %143 op_type="RELU"

i.e., one bare ADD shared by both consumers and a standalone RELU chained on top. xmodel achieves this because get_template() picks up Relu standalone (its elew list includes "relu") whenever the through-Q/DQ fusion template (get_template_qlinear_eltwise_with_single_relu) is gated off, or rejects on a multi-fanout filter.

Mirror that behavior here:

* FuseQuantizedEltwiseActivation::matchAndRewrite now requires eltwiseOp->hasOneUse(). When the eltwise has multiple users we refuse to fuse, letting Pattern 1 emit a standalone activation op (FuseQuantizedEltwiseWithoutActivation<ONNXReluOp> is already registered for this).
* isInputFromPattern2Eltwise, which Pattern 1 uses to *defer* the standalone Relu/LeakyRelu rewrite when Pattern 2 can fuse, now returns false for multi-user eltwise so Pattern 1 takes over.

Single-user Add -> Relu is unaffected; the existing single-fused op is still emitted (perf-optimal). Multi-user cases now match the xmodel-flow golden form and remove the duplicated eltwise compute.

Co-authored-by: Cursor <cursoragent@cursor.com>
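The fuse-or-stand-alone decision described in the commit message can be illustrated with a small toy model. This is only a sketch under simplifying assumptions, not the onnx-mlir implementation: the `Op` class and the `connect` and `lower` helpers are hypothetical names invented for illustration, the Q/DQ nodes between the eltwise and the activation are ignored, and a plain user count stands in for `eltwiseOp->hasOneUse()`.

```python
from dataclasses import dataclass, field

# Hypothetical toy graph node; not an MLIR or onnx-mlir type.
@dataclass(eq=False)
class Op:
    kind: str                                   # e.g. "ADD", "RELU", "OTHER"
    inputs: list = field(default_factory=list)
    users: list = field(default_factory=list)

def connect(producer, consumer):
    """Record that `consumer` reads the result of `producer`."""
    consumer.inputs.append(producer)
    producer.users.append(consumer)

ELTWISE = {"ADD", "MUL", "SUB", "DIV", "TANH", "SQRT"}
ACTIVATIONS = {"RELU", "LEAKYRELU"}

def lower(ops):
    """Walk ops in topological order and return the emitted
    (op_type, nonlinear) pairs, mimicking the gated fusion."""
    emitted = []
    for op in ops:
        if op.kind in ACTIVATIONS:
            src = op.inputs[0]
            if src.kind in ELTWISE and len(src.users) == 1:
                # "Pattern 2": the eltwise has one use, so fuse it
                # with the activation into a single op.
                emitted.append((src.kind, op.kind))
            else:
                # "Pattern 1": multi-user producer, emit the
                # activation as a standalone op.
                emitted.append((op.kind, None))
        elif op.kind in ELTWISE:
            sole_user_is_act = (
                len(op.users) == 1 and op.users[0].kind in ACTIVATIONS
            )
            if not sole_user_is_act:
                # Not consumed by a fusion, so emit a bare eltwise once.
                emitted.append((op.kind, None))
    return emitted

# Single-user Add -> Relu: one fused op.
add1, relu1 = Op("ADD"), Op("RELU")
connect(add1, relu1)
single = lower([add1, relu1])

# Add feeding both a Relu and another consumer: bare ADD plus standalone RELU.
add2, relu2, other = Op("ADD"), Op("RELU"), Op("OTHER")
connect(add2, relu2)
connect(add2, other)
multi = lower([add2, relu2, other])
```

In this model the single-user case yields one fused `("ADD", "RELU")` entry, while the multi-user case yields a bare `("ADD", None)` followed by `("RELU", None)`, which corresponds to the golden form quoted above with no duplicated ADD.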

rachgupt-amd and others added 5 commits May 29, 2026 03:28

skip ABI bridges at function boundaries in propagate-quant-type and t…

ade5a70

…ranspose-optimization

Merge branch 'feature/onnx-to-tosa' into rachit/transpose-opt-boundar…

9450f13

…y-fold

add XmcRequantizePass and group propagation in PropagateQuantTypeThro…

0cf60f2

…ughDataFlow

transpose+reshape: refuse N-to-1 merging and absorb trailing 1s to ma…

47d6ab9

…tch vaip golden

nvunnam57128 force-pushed the nvunnam/combined-pr795-797-798 branch from e041d3c to 69598d5 Compare May 31, 2026 09:03

nvunnam57128 requested review from jorickert and p-lanza as code owners May 31, 2026 09:03

jorickert removed request for jorickert and p-lanza June 1, 2026 08:35

Merge branch 'feature/onnx-to-tosa' into nvunnam/combined-pr795-797-798

498a2dd

nvunnam57128 wants to merge 6 commits into Xilinx:feature/onnx-to-tosa from nvunnam57128:nvunnam/combined-pr795-797-798
