Nvunnam/combined pr801 803 804 by nvunnam57128 · Pull Request #805 · Xilinx/onnx-mlir

nvunnam57128 · 2026-06-01T17:49:35Z

No description provided.

…ting Add

Mirrors the IPU-specific 1x1-input optimisation in xcompiler's ReplaceQDQResizePass (src/pass/passes/ReplaceQDQResizePass.cpp lines 200-282).

When a quantised XFEResize takes a tensor of shape [N, 1, 1, C] and upsamples it to [N, H, W, C] (NHWC, with H>1 or W>1), the Resize is functionally a broadcast: there is only one source pixel per (N, C) so bilinear/nearest collapse to replication. The rewrite drops the Resize and emits

    %zero = onnx.Constant of shape [N, H, W, C] in the output quant type, storage value = output_zero_point (decodes to 0.0)
    %out = onnx.Add(%resize_input, %zero)

ONNX Add broadcasts [N, 1, 1, C] + [N, H, W, C] -> [N, H, W, C], producing the same numerical result as the original Resize. The synthetic zero Add is then collapsed by downstream eltwise / const-fold passes (xcompiler's pipeline does the same: ReplaceQDQResizePass tags the eltwise with original_resize_opt=true, and a later fusion absorbs the zero into a downstream skip-connection Add).

Match conditions:
* single-use XFEResize
* rank-4 static input AND output (NHWC)
* input_shape[1] == 1 && input_shape[2] == 1
* output_shape[1] > 1 || output_shape[2] > 1
* batch and channel dims match across input/output
* input and output are uniform quant types with matching scale/zp

This avoids the backend qlinear_resize kernel for the corner-case 1x1->HxW shape that often fails or runs sub-optimally on IPU; observed on scene_parser_512_256_v2_int8 (Resize_173_8), PSO3, PSA2, PSA3 and mep_v2/K2.

Placement: after ConvertToChannelLast (creates XFEResize), after the 5D->4D and transpose-optimisation passes (stable rank-4 NHWC), and right before ReplaceQuantizedTileToAddPass (its analogue for Tile) so the emitted onnx.Add is immediately lowered by ReplaceQDQEltwisePass.

Co-authored-by: Cursor <cursoragent@cursor.com>
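The equivalence claimed in the commit message above can be checked with a small standalone sketch. This is not the pass's code: the function names are hypothetical, tensors are plain nested lists in NHWC order, only nearest-neighbour resize is modelled, and the quantised Add is assumed to behave as dequantise, add, requantise with a shared int8 scale/zero point.

```python
from typing import List, Sequence

# NHWC tensor of int8 storage values, as nested lists (illustrative only).
Tensor4D = List[List[List[List[int]]]]


def is_1x1_broadcast_resize(in_shape: Sequence[int], out_shape: Sequence[int]) -> bool:
    """Shape-only part of the match conditions (hypothetical helper)."""
    if len(in_shape) != 4 or len(out_shape) != 4:
        return False
    n_in, h_in, w_in, c_in = in_shape
    n_out, h_out, w_out, c_out = out_shape
    return (
        h_in == 1 and w_in == 1            # single source pixel
        and (h_out > 1 or w_out > 1)       # genuine upsample
        and n_in == n_out and c_in == c_out
    )


def nearest_resize_nhwc(x: Tensor4D, out_h: int, out_w: int) -> Tensor4D:
    """Reference nearest-neighbour resize on the H and W axes."""
    in_h, in_w = len(x[0]), len(x[0][0])
    return [
        [
            [list(img[h * in_h // out_h][w * in_w // out_w]) for w in range(out_w)]
            for h in range(out_h)
        ]
        for img in x
    ]


def broadcast_add_zero(x: Tensor4D, out_h: int, out_w: int,
                       scale: float, zero_point: int) -> Tensor4D:
    """Add(x, zero): zero is [N, H, W, C] with storage value == zero_point.

    Assumed semantics: dequantise both operands, add, requantise to int8.
    """
    zero_real = (zero_point - zero_point) * scale  # decodes to 0.0
    out: Tensor4D = []
    for img in x:
        pixel = img[0][0]  # broadcast source: the only pixel for this batch item
        requant = []
        for q in pixel:
            real = (q - zero_point) * scale + zero_real
            stored = round(real / scale) + zero_point
            requant.append(max(-128, min(127, stored)))
        out.append([[list(requant) for _ in range(out_w)] for _ in range(out_h)])
    return out
```

For a [1, 1, 1, 4] input upsampled to [1, 2, 3, 4], both functions return the same nested list, which is the property the rewrite relies on. The quant-type and single-use conditions from the list above are not modelled here.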

Apply repo .clang-format (LLVM + AlwaysBreakTemplateDeclarations: Yes + AlignAfterOpenBracket: DontAlign) to ReplaceQDQResizePass.cpp. Whitespace-only fix for the 4 violations reported by clang-format 20.0.0git on the previous commit (struct header break, two notifyMatchFailure continuation breaks, and the rewriter.create<ONNXConstantOp> continuation). Co-authored-by: Cursor <cursoragent@cursor.com>

…ot/Erf/Mod, normalize Softmax axis to last, disable Where

nvunnam57128 requested review from jorickert and p-lanza as code owners June 1, 2026 17:49

nvunnam57128 marked this pull request as draft June 1, 2026 17:49

nvunnam57128 marked this pull request as ready for review June 1, 2026 17:50

rachgupt-amd and others added 4 commits June 1, 2026 23:36

add recompose-hard-sigmoid pass and propagate alpha/beta to HSIGMOID

4401b49

transpose-opt vaip parity: log-only Softmax + LogSoftmax + Softplus/N…

c132477

…ot/Erf/Mod, normalize Softmax axis to last, disable Where

nvunnam57128 force-pushed the nvunnam/combined-pr800-801-804 branch from 8e7cb7f to c132477 Compare June 2, 2026 05:36

jorickert removed request for jorickert and p-lanza June 2, 2026 07:04

nvunnam57128 changed the title ~~Nvunnam/combined pr800 801 804~~ Nvunnam/combined pr801 803 804 Jun 2, 2026

nvunnam57128 wants to merge 4 commits into Xilinx:feature/onnx-to-tosa from nvunnam57128:nvunnam/combined-pr800-801-804


2 participants
