Benchmark/v5.0.0#6
Draft
chilianyi wants to merge 1098 commits into
and add test case for input_guard
Kicking this in for testing. The new step only works after it lives on the `master` branch.
Removed 'labeled' and 'unlabeled' types from pull request triggers.

Signed-off-by: Qiming Teng <tengqm@outlook.com>

**Summary**
- Fixes Kunlunxin `bmm_out` call signature to match `bmm_kernel`, avoiding duplicate constexpr binding errors (e.g., `TILE_M`).
- Removes unintended stride arguments that are not accepted by the kernel.

**Why**
- `bmm_out` was passing extra parameters, leading to a runtime `TypeError` during JIT binding in tests.

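The failure mode described above can be sketched with a plain-Python analogy. This is not Triton's binding code or the actual Kunlunxin kernel; `bmm_kernel_stub` and its parameter list are hypothetical stand-ins, chosen only to show how one surplus positional argument can land in a slot that is also bound by keyword.

```python
def bmm_kernel_stub(a, b, out, M, N, K, TILE_M=32):
    """Hypothetical stand-in for a kernel whose last parameter is a compile-time constant."""
    return (M, N, K, TILE_M)


def call_with_extra_args():
    # The surplus positional argument (standing in for a stride) spills into
    # TILE_M, which is then bound a second time by keyword.
    return bmm_kernel_stub("a", "b", "out", 4, 5, 6, 1, TILE_M=64)


def call_matching_signature():
    # Dropping the surplus argument leaves TILE_M bound exactly once.
    return bmm_kernel_stub("a", "b", "out", 4, 5, 6, TILE_M=64)


try:
    call_with_extra_args()
except TypeError as exc:
    error_message = str(exc)
else:
    error_message = None

print(error_message)
print(call_matching_signature())
```

In this analogy the fix is the same as the one the summary describes: make the caller pass exactly the arguments the kernel declares.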
* add test_perf_reshape_and_cache
  add benchmark for reshape_and_cache
* update core shapes

* add test_perf_per_token_group_quant_fp8
  add benchmark for per_token_group_quant_fp8
* Update core_shapes.yaml

---------

Signed-off-by: kiddyjinjin <54064850+kiddyjinjin@users.noreply.github.com>
Co-authored-by: kiddyjinjin <54064850+kiddyjinjin@users.noreply.github.com>

* add test_perf_concat_and_cache_mla
  add benchmark for concat_and_cache_mla
* fix code-format-check
* Update core_shapes.yaml
* fix code-style

---------

Signed-off-by: kiddyjinjin <54064850+kiddyjinjin@users.noreply.github.com>
Co-authored-by: kiddyjinjin <54064850+kiddyjinjin@users.noreply.github.com>

This reverts commit 97c7e18.
* Add special_i0e operator implementation, tests and benchmark
  - Migrated from experimental_ops to ops
  - Added unit tests in tests/test_unary_pointwise_ops.py
  - Added to forward_operations in benchmark/test_unary_pointwise_perf.py
  - All 18 unit tests passed
  - Speedup: 1.0-1.9x across dtypes

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* chore: add KernelGen source comment
* fix: codestyle fixes from pre-commit

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

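For readers unfamiliar with the operator, `i0e` is commonly defined (for example in `torch.special.i0e`) as the exponentially scaled modified Bessel function of the first kind, order zero: `exp(-|x|) * I0(x)`. The sketch below is a pure-Python reference built on the power series of `I0`. It is not the Triton kernel from this commit; the helper names `i0_series` and `i0e_reference` are hypothetical, and the series is only suitable for moderate `|x|`.

```python
import math


def i0_series(x, terms=60):
    """I0(x) via its power series: sum over k of ((x/2)^2)^k / (k!)^2."""
    half_sq = (x / 2.0) ** 2
    term = 1.0   # k = 0 term
    total = 1.0
    for k in range(1, terms):
        # Ratio of consecutive terms is (x/2)^2 / k^2.
        term *= half_sq / (k * k)
        total += term
    return total


def i0e_reference(x):
    """Exponentially scaled I0: exp(-|x|) * I0(x)."""
    return math.exp(-abs(x)) * i0_series(x)


print(i0e_reference(0.0))
print(i0e_reference(1.0))
```

A reference of this kind could serve as a sanity check for a few hand-picked inputs; the unit tests mentioned in the commit are not shown here and may compare against a different reference.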
… ops (#1912)
* Migrate _upsample_nearest_exact1d from experimental_ops to ops
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: correct isort ordering in ops/__init__.py
* chore: add KernelGen source comment
* feat: add logger.debug for tracing
* fix: add missing blank lines in test_special_ops.py
* fix: repair broken lift_fresh_copy benchmark
* fix: remove blank lines between decorators (E304) and add missing assert (F841)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* Migrate logit_ from experimental to main ops
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: use sigmoid-generated input in logit_ test for numerical stability
  Match the experimental test pattern using torch.sigmoid(uniform(-4,4)) to generate inputs in (0,1) range. Add manual_seed for reproducibility.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: add upcast=True for logit_ reference comparison
  Without upcast, ref stays in float16 causing precision mismatch with the Triton kernel which computes in float32.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: rename kernel from logit__kernel to logit_kernel for consistency
* feat: add logger.debug for tracing
* fix: add missing blank lines in test_unary_pointwise_ops.py
* fix: add missing assert in absolute test and fix formatting
* fix: add missing assert for rrelu_with_noise_backward
* fix: codestyle fixes from pre-commit

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: factnn <factnn@example.com>

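The input-generation pattern in the second commit above can be illustrated without PyTorch. `logit(p) = log(p / (1 - p))` is only finite for `p` strictly inside (0, 1), so passing uniform samples from [-4, 4] through a sigmoid keeps every input roughly within (0.018, 0.982), away from both endpoints. The sketch below uses only the standard library; `sigmoid`, `logit`, and the sample count are illustrative assumptions, not the test code from the PR.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def logit(p):
    # Only finite for 0 < p < 1.
    return math.log(p / (1.0 - p))


random.seed(0)  # fixed seed, mirroring the manual_seed mentioned in the commit
raw = [random.uniform(-4.0, 4.0) for _ in range(1000)]
probs = [sigmoid(x) for x in raw]          # always strictly inside (0, 1)
roundtrip = [logit(p) for p in probs]      # should recover the raw samples
max_err = max(abs(r - x) for r, x in zip(roundtrip, raw))

print(min(probs), max(probs), max_err)
```

Because the inputs stay away from 0 and 1, the round trip remains well conditioned, which is the stability property the commit message is after.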
* [kunlunxin] fix arange operation
* add pad

The CI job failure was not introduced by this PR.
Signed-off-by: Qiming Teng <tengqm@outlook.com>
PR Category
Type of Change
Description
Issue
Progress
Performance