fold decomposed HardSigmoid to match golden vaip HSIGMOID #801

Open
rachgupt-amd wants to merge 1 commit into feature/onnx-to-tosa from rachit/hsigmoid-decomposed-pattern

Conversation

@rachgupt-amd

No description provided.

rachgupt-amd force-pushed the rachit/hsigmoid-decomposed-pattern branch from bcf72ec to bf69587 (May 31, 2026 10:06)
rachgupt-amd force-pushed the rachit/hsigmoid-decomposed-pattern branch from bf69587 to 2a02d69 (May 31, 2026 10:44)
rachgupt-amd force-pushed the rachit/hsigmoid-decomposed-pattern branch from 2a02d69 to bf17ff6 (May 31, 2026 10:48)

Why not use the existing RecomposeHardSigmoidFromMulClipPattern?

+1

Author


We have some specific constraints in the xcompiler flow, which is why I added a new pass:

1.) The original ONNX transform pass preserves the exact source constants. Xcompiler emits alpha=0.2, beta=0.5 (the ONNX defaults, via clone_attrs(relu6)) as a backend requirement. For bit-exact golden parity our pass snaps to those defaults, which would be a semantics-breaking change in the upstream pattern.

2.) The upstream pattern accepts any alpha > 0. The DPU only supports the canonical (x+3)/6 form, so we strictly require alpha ≈ 1/6 and beta ≈ 0.5 (mirroring the xcompiler == 3 check) and bail out otherwise.

3.) The Matting model's Xcompiler export has clipMax = 1.00012207 (FP drift). The upstream isConstOf(clipMax, 1.0) check uses strict equality. Loosening it there would silently re-fold legitimate clip(., 0, 1.001) chains in the CPU/TOSA lowerings; we want the tolerance gated by the canonical alpha/beta check, which only makes sense in an Xcompiler pass.
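
To make the three constraints concrete, here is a rough Python sketch of the matching logic as described above. It is not the code in this PR: the function name, the tolerance values, and the modelling of clip(alpha * x + beta, clipMin, clipMax) as four plain floats are all assumptions made for illustration.

```python
import math

# Hypothetical tolerances; the thread does not state the values the pass uses.
COEFF_TOL = 1e-3
CLIP_TOL = 5e-4

# Attributes the comment says xcompiler emits (the ONNX HardSigmoid defaults).
GOLDEN_ALPHA = 0.2
GOLDEN_BETA = 0.5


def near(value, target, tol):
    """Absolute-tolerance comparison (relative tolerance disabled)."""
    return math.isclose(value, target, rel_tol=0.0, abs_tol=tol)


def match_decomposed_hsigmoid(alpha, beta, clip_min, clip_max):
    """Return the attributes to emit, or None to bail out.

    Models clip(alpha * x + beta, clip_min, clip_max) by its four constants.
    This is an illustrative stand-in, not the pattern from the PR.
    """
    # Constraint 2: accept only the canonical (x + 3) / 6 form.
    if not (near(alpha, 1.0 / 6.0, COEFF_TOL) and near(beta, 0.5, COEFF_TOL)):
        return None

    # Constraint 3: the clip-bound tolerance is only consulted after the
    # canonical alpha/beta check has passed, so other clip chains are untouched.
    if not (near(clip_min, 0.0, CLIP_TOL) and near(clip_max, 1.0, CLIP_TOL)):
        return None

    # Constraint 1: snap to the golden attributes instead of preserving the
    # source constants.
    return {"alpha": GOLDEN_ALPHA, "beta": GOLDEN_BETA}
```

With these assumed tolerances, the drifted clipMax = 1.00012207 folds when alpha and beta are canonical, while the same clipMax with a non-canonical alpha is left alone.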

3 participants