Nvunnam/combined pr795 797 798 #800

Open
nvunnam57128 wants to merge 6 commits into Xilinx:feature/onnx-to-tosa from nvunnam57128:nvunnam/combined-pr795-797-798

Conversation

@nvunnam57128
No description provided.

rachgupt-amd and others added 5 commits May 29, 2026 03:28
When the quantized eltwise (Add/Mul/Sub/Div/Tanh/Sqrt) feeding an
Relu/LeakyRelu has more than one user, Pattern 2 today clones the
eltwise into the activation slot. The original eltwise survives
because of its other user(s), and Pattern 1 then claims it. The
result is two qlinear-eltwise ops with identical inputs that compute
the same value, e.g. for scene_parser around input.115/input.119:

  %140 = xir.qlinear_eltwise %139, %137 ... op_type="ADD"
  %141 = xir.qlinear_eltwise %139, %137 ... op_type="ADD",
                                            nonlinear="RELU"

This wastes a kernel invocation. The xmodel-flow on the same graph
produces the equivalent "golden" form:

  %143 = xir.qlinear_eltwise %142, %140  op_type="ADD"
  %144 = xir.qlinear_eltwise %143         op_type="RELU"

i.e., one bare ADD shared by both consumers and a standalone RELU
chained on top. xmodel achieves this because get_template() picks up
Relu standalone (its elew list includes "relu") whenever the
through-Q/DQ fusion template
(get_template_qlinear_eltwise_with_single_relu) is gated off or
rejects the match on a multi-fanout filter.
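
To make the comparison between the two forms concrete, here is a minimal sketch in plain Python. It assumes ordinary float arithmetic and ignores quantization entirely; `duplicated_form` and `golden_form` are hypothetical names for illustration, not functions from this repository. It only shows that both forms produce the same pair of results while the duplicated form evaluates the ADD twice.

```python
def duplicated_form(a, b):
    """Bare ADD plus a second ADD fused with RELU (the form this change removes)."""
    adds = 0
    bare = a + b                 # first qlinear_eltwise, op_type="ADD"
    adds += 1
    fused = max(a + b, 0.0)      # second qlinear_eltwise, ADD with nonlinear="RELU"
    adds += 1
    return bare, fused, adds


def golden_form(a, b):
    """One shared ADD with a standalone RELU chained on top."""
    shared = a + b               # single qlinear_eltwise, op_type="ADD"
    adds = 1
    activated = max(shared, 0.0) # standalone qlinear_eltwise, op_type="RELU"
    return shared, activated, adds


for a, b in [(1.5, -4.0), (2.0, 3.0)]:
    # Same values for both consumers, but one fewer ADD in the golden form.
    assert duplicated_form(a, b)[:2] == golden_form(a, b)[:2]
```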

Mirror that behavior here:

* FuseQuantizedEltwiseActivation::matchAndRewrite now requires
  eltwiseOp->hasOneUse(). When the eltwise has multiple users we
  refuse to fuse, letting Pattern 1 emit a standalone activation op
  (FuseQuantizedEltwiseWithoutActivation<ONNXReluOp> is already
  registered for this).

* isInputFromPattern2Eltwise, which Pattern 1 uses to *defer* the
  standalone Relu/LeakyRelu rewrite when Pattern 2 can fuse, now
  returns false for multi-user eltwise so Pattern 1 takes over.
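
The effect of the two changes above can be sketched with a toy graph model. This is not the MLIR implementation: `Op`, `uses`, `fuses`, `lower`, and `build` are hypothetical helpers, and the `require_single_use` flag stands in for the new `hasOneUse()` guard. Pattern 1 and Pattern 2 are collapsed into a single pass that only reports which qlinear-eltwise ops would be emitted.

```python
from dataclasses import dataclass, field
from typing import List

ELTWISE = {"ADD", "MUL", "SUB", "DIV", "TANH", "SQRT"}
ACTIVATIONS = {"RELU", "LEAKYRELU"}


@dataclass(eq=False)
class Op:
    kind: str
    inputs: List["Op"] = field(default_factory=list)


def uses(graph, op):
    """Every user of `op`, listed once per operand slot that reads it."""
    return [u for u in graph for i in u.inputs if i is op]


def fuses(graph, act, require_single_use):
    """Would the fusing pattern absorb the eltwise feeding `act`?"""
    if act.kind not in ACTIVATIONS or not act.inputs:
        return False
    src = act.inputs[0]
    if src.kind not in ELTWISE:
        return False
    # Assumed stand-in for the new guard: fuse only a single-use eltwise.
    return not require_single_use or len(uses(graph, src)) == 1


def lower(graph, require_single_use):
    """Return (op_type, nonlinear) pairs for the emitted eltwise ops."""
    emitted = []
    for op in graph:
        if op.kind in ELTWISE:
            # A bare eltwise survives if any user does not absorb it.
            if any(not fuses(graph, u, require_single_use) for u in uses(graph, op)):
                emitted.append((op.kind, None))
        elif op.kind in ACTIVATIONS:
            if fuses(graph, op, require_single_use):
                emitted.append((op.inputs[0].kind, op.kind))
            else:
                emitted.append((op.kind, None))  # standalone activation
    return emitted


def build(multi_user):
    """Add -> Relu, optionally with a second consumer of the Add."""
    a, b = Op("INPUT"), Op("INPUT")
    add = Op("ADD", [a, b])
    graph = [a, b, add, Op("RELU", [add])]
    if multi_user:
        graph.append(Op("CONSUMER", [add]))
    return graph


print(lower(build(multi_user=True), require_single_use=False))
print(lower(build(multi_user=True), require_single_use=True))
```

In this model, the multi-user graph without the guard yields a bare ADD plus an ADD fused with RELU, and with the guard yields a bare ADD plus a standalone RELU. The single-user graph yields one fused ADD/RELU op in both modes.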

Single-user Add -> Relu is unaffected; the existing single-fused
op is still emitted (perf-optimal). Multi-user cases now match the
xmodel-flow golden form and remove the duplicated eltwise compute.

Co-authored-by: Cursor <cursoragent@cursor.com>
@nvunnam57128 nvunnam57128 force-pushed the nvunnam/combined-pr795-797-798 branch from e041d3c to 69598d5 Compare May 31, 2026 09:03
@jorickert jorickert removed request for jorickert and p-lanza June 1, 2026 08:35
3 participants