fold decomposed HardSigmoid to match golden vaip HSIGMOID #801
rachgupt-amd wants to merge 1 commit into
Why not use the existing RecomposeHardSigmoidFromMulClipPattern?
We have some specific constraints in the xcompiler flow, which is why I added a new pass:
1.) The original ONNX transform pass preserves the exact source constants. Xcompiler emits alpha=0.2, beta=0.5 (the ONNX defaults, via clone_attrs(relu6)) as a backend requirement. For bit-exact golden parity our pass snaps to those defaults, which would be a semantics-breaking change if made in the upstream pattern.
2.) The upstream pattern accepts any alpha > 0. The DPU only supports the canonical (x+3)/6 form, so we strictly require alpha ≈ 1/6 and beta ≈ 0.5 (mirroring xcompiler's == 3 check) and bail out otherwise.
3.) Matting's xcompiler model export has clipMax = 1.00012207 (FP drift). The upstream isConstOf(clipMax, 1.0) check is strict equality. Loosening it there would silently re-fold legitimate clip(., 0, 1.001) chains in the CPU/TOSA lowerings. We want the tolerance gated by the canonical alpha/beta check, which only makes sense in an xcompiler pass.
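To make the three constraints concrete, here is a minimal Python sketch of the matching logic as described above. It is not the code in this PR: the function name `try_fold_hard_sigmoid`, the constant names, and both tolerance values are assumptions chosen for illustration, and the real pass operates on graph ops rather than bare floats.

```python
import math
from typing import Optional, Tuple

# Canonical (x + 3) / 6 form, i.e. x * (1/6) + 0.5.
CANONICAL_ALPHA = 1.0 / 6.0
CANONICAL_BETA = 0.5

# Attributes written on the folded op (the ONNX HardSigmoid defaults).
EMITTED_ALPHA = 0.2
EMITTED_BETA = 0.5

# Hypothetical tolerances; the values used by the actual pass are not shown here.
COEFF_TOL = 1e-3
CLIP_TOL = 5e-4


def _near(value: float, target: float, tol: float) -> bool:
    """Absolute-tolerance comparison (no relative component)."""
    return math.isclose(value, target, rel_tol=0.0, abs_tol=tol)


def try_fold_hard_sigmoid(
    alpha: float, beta: float, clip_min: float, clip_max: float
) -> Optional[Tuple[float, float]]:
    """Return the (alpha, beta) to emit, or None if the chain must not fold."""
    # Constraint 2: only the canonical form is accepted; bail out otherwise.
    if not _near(alpha, CANONICAL_ALPHA, COEFF_TOL):
        return None
    if not _near(beta, CANONICAL_BETA, COEFF_TOL):
        return None
    # Constraint 3: the clip bounds tolerate small FP drift, but this check
    # is only reached once the canonical alpha/beta gate has passed.
    if not _near(clip_min, 0.0, CLIP_TOL):
        return None
    if not _near(clip_max, 1.0, CLIP_TOL):
        return None
    # Constraint 1: snap to the defaults instead of keeping source constants.
    return (EMITTED_ALPHA, EMITTED_BETA)
```

With these assumed tolerances, `try_fold_hard_sigmoid(1 / 6, 0.5, 0.0, 1.00012207)` folds and yields the snapped defaults, while a non-canonical slope such as `alpha=0.25` or a ReLU6-style upper bound such as `clip_max=6.0` returns `None`.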
No description provided.