[pull] main from apache:main#190
Merged
## Which issue does this PR close?

- Closes #21997 (potentially).

## Rationale for this change

This PR adds two new APIs to `GenericStringArrayBuilder` and `StringViewArrayBuilder`:

1. `append_with` appends a row whose bytes are produced by invoking a closure that is passed a `StringWriter`.
2. `append_byte_map` appends a row whose bytes are produced by mapping each byte of the input with a byte-to-byte map closure.

For `StringViewArrayBuilder`, `StringWriter` is an append-only string writer that automatically switches between writing to a new inline view (for short strings) and writing to the in-progress data block. For `GenericStringArrayBuilder`, `StringWriter` just appends to the value buffer directly.

(We need two new APIs because `append_byte_map` vectorizes a lot better than `append_with`, so callers that fit the byte-to-byte map pattern should prefer it.)

Both of these new APIs allow string UDFs to avoid creating an intermediate data copy in many cases. To illustrate this, this PR adopts the new APIs in `replace`.

Benchmarks (Arm64):

Group 1: ASCII single-byte fast path (StringArray)

- size=1024 str_len=32 nulls=0.0 : 16.27 µs -> 12.83 µs (−21.1%)
- size=1024 str_len=32 nulls=0.2 : 14.23 µs -> 12.10 µs (−15.0%)
- size=1024 str_len=128 nulls=0.0 : 11.28 µs -> 8.21 µs (−27.3%)
- size=1024 str_len=128 nulls=0.2 : 10.37 µs -> 7.79 µs (−24.9%)
- size=4096 str_len=32 nulls=0.0 : 62.48 µs -> 49.50 µs (−20.8%)
- size=4096 str_len=32 nulls=0.2 : 55.74 µs -> 46.66 µs (−16.3%)
- size=4096 str_len=128 nulls=0.0 : 42.26 µs -> 29.06 µs (−31.2%)
- size=4096 str_len=128 nulls=0.2 : 39.17 µs -> 28.52 µs (−27.2%)

Group 2: Multi-byte StringArray, general writer path

- size=1024 str_len=32 nulls=0.0 : 23.58 µs -> 21.75 µs (−7.8%)
- size=1024 str_len=32 nulls=0.2 : 18.92 µs -> 17.41 µs (−8.0%)
- size=1024 str_len=128 nulls=0.0 : 37.56 µs -> 35.33 µs (−5.9%)
- size=1024 str_len=128 nulls=0.2 : 29.62 µs -> 28.71 µs (−3.1%)
- size=4096 str_len=32 nulls=0.0 : 97.15 µs -> 88.92 µs (−8.5%)
- size=4096 str_len=32 nulls=0.2 : 77.03 µs -> 71.43 µs (−7.3%)
- size=4096 str_len=128 nulls=0.0 : 173.66 µs -> 163.68 µs (−5.7%)
- size=4096 str_len=128 nulls=0.2 : 134.98 µs -> 128.56 µs (−4.8%)

Group 3: Multi-byte StringViewArray, general writer path

- size=1024 str_len=32 nulls=0.0 : 24.46 µs -> 22.18 µs (−9.3%)
- size=1024 str_len=32 nulls=0.2 : 20.04 µs -> 17.71 µs (−11.7%)
- size=1024 str_len=128 nulls=0.0 : 36.43 µs -> 35.79 µs (−1.8%)
- size=1024 str_len=128 nulls=0.2 : 29.73 µs -> 28.70 µs (−3.5%)
- size=4096 str_len=32 nulls=0.0 : 99.07 µs -> 89.68 µs (−9.5%)
- size=4096 str_len=32 nulls=0.2 : 84.38 µs -> 72.46 µs (−14.1%)
- size=4096 str_len=128 nulls=0.0 : 169.27 µs -> 164.80 µs (−2.6%, n.s.)
- size=4096 str_len=128 nulls=0.2 : 133.79 µs -> 130.20 µs (−2.7%, n.s.)

Group 4: Empty-from StringArray

- size=1024 str_len=32 : 87.75 µs -> 50.64 µs (−42.3%)
- size=1024 str_len=128 : 313.00 µs -> 187.77 µs (−40.0%)

Group 5: Empty-from StringViewArray

- size=1024 str_len=32 : 87.01 µs -> 50.10 µs (−42.4%)
- size=1024 str_len=128 : 313.99 µs -> 190.17 µs (−39.4%)

## What changes are included in this PR?

* Add `append_byte_map` and `append_with` to both of the bulk-NULL string builders
* Add unit tests
* Adopt the new APIs in `replace`

## Are these changes tested?

Yes; new tests added.

## Are there any user-facing changes?

No.
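To make the difference between `append_with` and `append_byte_map` concrete, here is a minimal sketch on a toy builder. Everything in it is an assumption for illustration: `SimpleStringBuilder`, `RowWriter`, `row`, and `replace_into` are hypothetical names, the signatures are not taken from the PR, and the sketch only handles ASCII in the byte-map path. It is meant to show the general idea of writing a row straight into the value buffer without an intermediate `String`, not the actual DataFusion implementation.

```rust
/// Hypothetical stand-in for a string array builder: one contiguous value
/// buffer plus the end offset of every row (no null handling in this sketch).
pub struct SimpleStringBuilder {
    values: Vec<u8>,
    offsets: Vec<usize>,
}

/// Hypothetical append-only writer handed to the `append_with` closure.
pub struct RowWriter<'a> {
    buf: &'a mut Vec<u8>,
}

impl RowWriter<'_> {
    /// Append a string fragment to the row that is currently being built.
    pub fn push_str(&mut self, s: &str) {
        self.buf.extend_from_slice(s.as_bytes());
    }
}

impl SimpleStringBuilder {
    pub fn new() -> Self {
        Self { values: Vec::new(), offsets: vec![0] }
    }

    /// Append one row whose bytes are produced by the closure, which writes
    /// directly into the value buffer instead of returning a temporary String.
    pub fn append_with<F: FnOnce(&mut RowWriter<'_>)>(&mut self, f: F) {
        let mut writer = RowWriter { buf: &mut self.values };
        f(&mut writer);
        self.offsets.push(self.values.len());
    }

    /// Append one row by mapping every input byte through `map`.
    /// Assumption of this sketch: ASCII in, ASCII out, so the buffer stays valid UTF-8.
    pub fn append_byte_map<F: FnMut(u8) -> u8>(&mut self, input: &str, map: F) {
        assert!(input.is_ascii(), "sketch only handles ASCII input");
        let start = self.values.len();
        self.values.extend(input.bytes().map(map));
        assert!(self.values[start..].is_ascii(), "sketch only handles ASCII output");
        self.offsets.push(self.values.len());
    }

    /// Number of rows appended so far.
    pub fn len(&self) -> usize {
        self.offsets.len() - 1
    }

    /// Read back row `i` as a string slice.
    pub fn row(&self, i: usize) -> &str {
        let bytes = &self.values[self.offsets[i]..self.offsets[i + 1]];
        std::str::from_utf8(bytes).expect("builder only stores valid UTF-8")
    }
}

/// A `replace`-style helper built on `append_with`: every occurrence of `from`
/// in `input` is replaced by `to`, written straight into the builder.
/// An empty `from` is simply copied through here, which is a simplification
/// and not necessarily what the real `replace` does.
pub fn replace_into(builder: &mut SimpleStringBuilder, input: &str, from: &str, to: &str) {
    builder.append_with(|w| {
        if from.is_empty() {
            w.push_str(input);
            return;
        }
        let mut rest = input;
        while let Some(pos) = rest.find(from) {
            w.push_str(&rest[..pos]);
            w.push_str(to);
            rest = &rest[pos + from.len()..];
        }
        w.push_str(rest);
    });
}
```

In this sketch the byte-map path is a single tight loop over bytes with no branching on match positions, which is the kind of shape that tends to vectorize well; the writer path is more general but has to search and copy in pieces.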
## Which issue does this PR close?

- Closes #.

## Rationale for this change

`rand()` is a common alias for `random()` in SQL engines. Supporting it improves compatibility and lets users write `rand()` as an equivalent zero-argument volatile random function.

## What changes are included in this PR?

- Adds `rand` as an alias for the existing `random()` scalar function.
- Adds a sqllogictest case verifying that `rand()` resolves successfully and returns a `Float64` value in the expected `[0, 1)` range.

## Are these changes tested?

Yes.

- `cargo fmt --all`
- `cargo test --package datafusion-functions random --lib`
- `cargo test --features backtrace,parquet_encryption --profile ci --package datafusion-sqllogictest --test sqllogictests -- functions.slt`

## Are there any user-facing changes?

Yes. Users can now call `rand()` as an alias for `random()`.
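The alias behavior described above can be illustrated with a small, self-contained sketch. This is not DataFusion's function registry: `Registry`, `register`, `resolve`, and `default_registry` are hypothetical names, and the xorshift generator is only a stand-in so the example needs no external crates. The point is that an alias is just a second name resolving to the same function, whose output should fall in `[0, 1)`.

```rust
use std::collections::HashMap;

/// Hypothetical scalar function type: takes generator state, returns a Float64.
pub type ScalarFn = fn(&mut u64) -> f64;

/// Stand-in for `random()`: one xorshift64 step mapped into [0, 1).
/// The state must be non-zero, otherwise xorshift stays at zero forever.
pub fn random(state: &mut u64) -> f64 {
    let mut x = *state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    *state = x;
    // Keep the top 53 bits so the value is exactly representable as f64,
    // then divide by 2^53 to land in [0, 1).
    (x >> 11) as f64 / (1u64 << 53) as f64
}

/// Hypothetical name-to-function registry with alias support.
pub struct Registry {
    functions: HashMap<&'static str, ScalarFn>,
}

impl Registry {
    pub fn new() -> Self {
        Self { functions: HashMap::new() }
    }

    /// Register `f` under its primary name and under every alias.
    pub fn register(&mut self, name: &'static str, aliases: &[&'static str], f: ScalarFn) {
        self.functions.insert(name, f);
        for alias in aliases {
            self.functions.insert(*alias, f);
        }
    }

    /// Look up a function by primary name or alias.
    pub fn resolve(&self, name: &str) -> Option<ScalarFn> {
        self.functions.get(name).copied()
    }
}

/// Registry with `random` registered and `rand` as its alias.
pub fn default_registry() -> Registry {
    let mut registry = Registry::new();
    registry.register("random", &["rand"], random);
    registry
}
```

Resolving `"rand"` and `"random"` from `default_registry()` yields the same function, so both names produce identical values for the same starting state.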
Bumps [runs-on/action](https://github.com/runs-on/action) from 2.1.0 to 2.1.2.

<details>
<summary>Release notes</summary>
<p><em>Sourced from <a href="https://github.com/runs-on/action/releases">runs-on/action's releases</a>.</em></p>
<blockquote>
<h2>v2.1.2</h2>
<p><strong>Full Changelog</strong>: <a href="https://github.com/runs-on/action/compare/v2.1.1...v2.1.2">https://github.com/runs-on/action/compare/v2.1.1...v2.1.2</a></p>
<h2>v2.1.1</h2>
<h2>What's Changed</h2>
<ul>
<li>Upgrade go, deps, and sanitize values better by <a href="https://github.com/crohr"><code>@crohr</code></a> in <a href="https://redirect.github.com/runs-on/action/pull/30">runs-on/action#30</a></li>
<li>Propagate actions runtime token by <a href="https://github.com/crohr"><code>@crohr</code></a> in <a href="https://redirect.github.com/runs-on/action/pull/34">runs-on/action#34</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/runs-on/action/compare/v2.1.0...v2.1.1">https://github.com/runs-on/action/compare/v2.1.0...v2.1.1</a></p>
</blockquote>
</details>

<details>
<summary>Commits</summary>
<ul>
<li><a href="https://github.com/runs-on/action/commit/d141ef83eb66d096ce8afc767e09115a65c63b60"><code>d141ef8</code></a> dist: rebuild binaries for v2.1.2</li>
<li><a href="https://github.com/runs-on/action/commit/c5df5533f2cf2dec19ef109dcd2a6d28fae8928f"><code>c5df553</code></a> Add manual release workflow with gpg signing and checksums</li>
<li><a href="https://github.com/runs-on/action/commit/e46a3c6d62a5df0b0e0f8b5fdc50f04c5ecf147f"><code>e46a3c6</code></a> dist: rebuild binaries</li>
<li><a href="https://github.com/runs-on/action/commit/88629fc77cb7a6a251ccdf8290a55cd08928794e"><code>88629fc</code></a> Send runtime token to Magic Cache config</li>
<li><a href="https://github.com/runs-on/action/commit/6e9cb2b901c5953075dbcd45edf77b0192c8e6b1"><code>6e9cb2b</code></a> Update actions</li>
<li><a href="https://github.com/runs-on/action/commit/408de89233b56a743b98d68195bb1923d60543aa"><code>408de89</code></a> dist: rebuild binaries</li>
<li><a href="https://github.com/runs-on/action/commit/e8a2e6d65ae450e212faf12b84577c012cf764ab"><code>e8a2e6d</code></a> Remove dead code: unused MetricSummary fields and calculateMin/calculateMax f...</li>
<li><a href="https://github.com/runs-on/action/commit/3a86586805f7eeb1ef42293156a97c326ff040fd"><code>3a86586</code></a> dist: rebuild binaries</li>
<li><a href="https://github.com/runs-on/action/commit/61a7be1daf74b5271d1238f1d4158fa3a5545ae6"><code>61a7be1</code></a> build: upgrade to go 1.26</li>
<li>See full diff in <a href="https://github.com/runs-on/action/compare/742bf56072eb4845a0f94b3394673e4903c90ff0...d141ef83eb66d096ce8afc767e09115a65c63b60">compare view</a></li>
</ul>
</details>
<br />

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:

- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
… bits (#22158)

## Which issue does this PR close?

N/A

## Rationale for this change

1. Instead of counting set bits to check whether at least one bit is set, use the existing helpers on `BooleanArray` that check whether at least one bit is set.
2. Avoid unnecessary `BooleanBuffer` bitwise operations and reuse the mask.

## What changes are included in this PR?

Reuse the mask, and use a helper to check whether at least one value is false.

## Are these changes tested?

Existing tests.

## Are there any user-facing changes?

No.

------

Cc @gstvg, @comphead
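The first point in the rationale can be sketched on a plain bitmask stored as `u64` words. The function names below are hypothetical and this is not the `BooleanArray` or `BooleanBuffer` API; it only illustrates why an "is any bit set / unset" check can be cheaper than a full population count: it can stop at the first word that answers the question.

```rust
/// Counts every set bit and compares with zero: always scans the whole mask.
pub fn has_set_bit_by_count(words: &[u64]) -> bool {
    words.iter().map(|w| w.count_ones() as usize).sum::<usize>() > 0
}

/// Returns as soon as a non-zero word is found.
pub fn has_set_bit(words: &[u64]) -> bool {
    words.iter().any(|w| *w != 0)
}

/// Returns true if at least one of the first `len` bits is unset
/// (the "at least one false" check), again with early exit.
pub fn has_unset_bit(words: &[u64], len: usize) -> bool {
    let full_words = len / 64;
    let remaining_bits = len % 64;
    let needed_words = full_words + usize::from(remaining_bits > 0);
    assert!(words.len() >= needed_words, "mask is shorter than `len` bits");

    // Any full word that is not all ones contains an unset bit.
    if words[..full_words].iter().any(|w| *w != u64::MAX) {
        return true;
    }
    // Only the low `remaining_bits` of the trailing word are part of the mask.
    if remaining_bits > 0 {
        let valid = (1u64 << remaining_bits) - 1;
        return words[full_words] & valid != valid;
    }
    false
}
```

Both `has_set_bit` variants return the same answer; the difference is only in how much of the mask they have to read before answering.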
See Commits and Changes for more details.
Created by
pull[bot] (v2.0.0-alpha.4)