[pull] main from apache:main#190

Merged
pull[bot] merged 4 commits into buraksenn:main from apache:main on May 14, 2026

@pull pull Bot commented May 14, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

neilconway and others added 4 commits May 14, 2026 12:25
## Which issue does this PR close?

- Closes #21997 (potentially).

## Rationale for this change

This PR adds two new APIs to `GenericStringArrayBuilder` and
`StringViewArrayBuilder`:

1. `append_with` appends a row whose bytes are produced by invoking a
closure that is passed a `StringWriter`
2. `append_byte_map` appends a row whose bytes are produced by mapping
each byte of the input with a byte-to-byte map closure.

For `StringViewArrayBuilder`, `StringWriter` is an append-only string
writer that switches between writing to a new inline view (for short
strings) or to the in-progress data block automatically. For
`GenericStringArrayBuilder`, `StringWriter` just appends to the value
buffer directly.

(We need two new APIs because `append_byte_map` vectorizes a lot better
than `append_with`, so callers that fit the byte-to-byte map pattern
should prefer it.)

Both of these new APIs allow string UDFs to avoid creating an
intermediate data copy in many cases. To illustrate this, this PR adopts
the new APIs in `replace`.
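
As a rough illustration of the two call shapes, here is a minimal sketch using toy stand-ins. `ToyStringBuilder`, `ToyWriter`, and `replace_into` are hypothetical names invented for this example and are not the DataFusion builders or their real signatures; the sketch only assumes a single contiguous value buffer with offsets, similar in spirit to the `GenericStringArrayBuilder` case described above.

```rust
/// Toy stand-in for a string array builder: one contiguous value buffer plus
/// row offsets. Hypothetical; not the DataFusion types named in this PR.
pub struct ToyStringBuilder {
    values: Vec<u8>,
    offsets: Vec<usize>,
}

/// Append-only writer handed to the `append_with` closure (hypothetical).
pub struct ToyWriter<'a> {
    buf: &'a mut Vec<u8>,
}

impl ToyWriter<'_> {
    /// Append a string slice to the in-progress row.
    pub fn write_str(&mut self, s: &str) {
        self.buf.extend_from_slice(s.as_bytes());
    }
}

impl ToyStringBuilder {
    pub fn new() -> Self {
        Self { values: Vec::new(), offsets: vec![0] }
    }

    /// Append one row whose bytes are written by `f` directly into the value
    /// buffer, so no intermediate `String` is allocated per row.
    pub fn append_with<F: FnOnce(&mut ToyWriter<'_>)>(&mut self, f: F) {
        let mut w = ToyWriter { buf: &mut self.values };
        f(&mut w);
        self.offsets.push(self.values.len());
    }

    /// Append one row by mapping each input byte through `map`.
    /// Assumes `map` keeps the output valid UTF-8 (e.g. ASCII to ASCII).
    pub fn append_byte_map<F: Fn(u8) -> u8>(&mut self, input: &str, map: F) {
        self.values.extend(input.bytes().map(map));
        self.offsets.push(self.values.len());
    }

    /// Read back row `i` as a string slice.
    pub fn row(&self, i: usize) -> &str {
        let bytes = &self.values[self.offsets[i]..self.offsets[i + 1]];
        std::str::from_utf8(bytes).unwrap()
    }
}

/// Replace every occurrence of `from` with `to`, writing straight into the
/// builder. Simplification for this sketch: an empty `from` copies the input
/// unchanged, which is not necessarily what the real `replace` does.
pub fn replace_into(b: &mut ToyStringBuilder, input: &str, from: &str, to: &str) {
    b.append_with(|w| {
        if from.is_empty() {
            w.write_str(input);
            return;
        }
        let mut rest = input;
        while let Some(pos) = rest.find(from) {
            w.write_str(&rest[..pos]);
            w.write_str(to);
            rest = &rest[pos + from.len()..];
        }
        w.write_str(rest);
    });
}
```

In this toy version, a single-byte ASCII replacement would go through `append_byte_map` (a straight-line loop over bytes), while multi-byte or variable-length replacements would go through `replace_into`, which is built on `append_with`.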

Benchmarks (Arm64):

  Group 1: ASCII single-byte fast path (StringArray)

  - size=1024 str_len=32 nulls=0.0 : 16.27 µs -> 12.83 µs (−21.1%)
  - size=1024 str_len=32 nulls=0.2 : 14.23 µs -> 12.10 µs (−15.0%)
  - size=1024 str_len=128 nulls=0.0 : 11.28 µs -> 8.21 µs (−27.3%)
  - size=1024 str_len=128 nulls=0.2 : 10.37 µs -> 7.79 µs (−24.9%)
  - size=4096 str_len=32 nulls=0.0 : 62.48 µs -> 49.50 µs (−20.8%)
  - size=4096 str_len=32 nulls=0.2 : 55.74 µs -> 46.66 µs (−16.3%)
  - size=4096 str_len=128 nulls=0.0 : 42.26 µs -> 29.06 µs (−31.2%)
  - size=4096 str_len=128 nulls=0.2 : 39.17 µs -> 28.52 µs (−27.2%)

  Group 2: Multi-byte StringArray — general writer path

  - size=1024 str_len=32 nulls=0.0 : 23.58 µs -> 21.75 µs (−7.8%)
  - size=1024 str_len=32 nulls=0.2 : 18.92 µs -> 17.41 µs (−8.0%)
  - size=1024 str_len=128 nulls=0.0 : 37.56 µs -> 35.33 µs (−5.9%)
  - size=1024 str_len=128 nulls=0.2 : 29.62 µs -> 28.71 µs (−3.1%)
  - size=4096 str_len=32 nulls=0.0 : 97.15 µs -> 88.92 µs (−8.5%)
  - size=4096 str_len=32 nulls=0.2 : 77.03 µs -> 71.43 µs (−7.3%)
  - size=4096 str_len=128 nulls=0.0 : 173.66 µs -> 163.68 µs (−5.7%)
  - size=4096 str_len=128 nulls=0.2 : 134.98 µs -> 128.56 µs (−4.8%)

  Group 3: Multi-byte StringViewArray — general writer path

  - size=1024 str_len=32 nulls=0.0 : 24.46 µs -> 22.18 µs (−9.3%)
  - size=1024 str_len=32 nulls=0.2 : 20.04 µs -> 17.71 µs (−11.7%)
  - size=1024 str_len=128 nulls=0.0 : 36.43 µs -> 35.79 µs (−1.8%)
  - size=1024 str_len=128 nulls=0.2 : 29.73 µs -> 28.70 µs (−3.5%)
  - size=4096 str_len=32 nulls=0.0 : 99.07 µs -> 89.68 µs (−9.5%)
  - size=4096 str_len=32 nulls=0.2 : 84.38 µs -> 72.46 µs (−14.1%)
  - size=4096 str_len=128 nulls=0.0 : 169.27 µs -> 164.80 µs (−2.6%, n.s.)
  - size=4096 str_len=128 nulls=0.2 : 133.79 µs -> 130.20 µs (−2.7%, n.s.)

  Group 4: Empty-from StringArray

  - size=1024 str_len=32 : 87.75 µs -> 50.64 µs (−42.3%)
  - size=1024 str_len=128 : 313.00 µs -> 187.77 µs (−40.0%)

  Group 5: Empty-from StringViewArray

  - size=1024 str_len=32 : 87.01 µs -> 50.10 µs (−42.4%)
  - size=1024 str_len=128 : 313.99 µs -> 190.17 µs (−39.4%)

## What changes are included in this PR?

* Add `append_byte_map` and `append_with` to both of the bulk-NULL
string builders
* Add unit tests
* Adopt the new APIs in `replace`

## Are these changes tested?

Yes; new tests added.

## Are there any user-facing changes?

No.
## Which issue does this PR close?

- Closes #.

## Rationale for this change

`rand()` is a common alias for `random()` in SQL engines. Supporting it
improves compatibility and lets users write `rand()` as an equivalent
zero-argument volatile random function.

## What changes are included in this PR?

- Adds `rand` as an alias for the existing `random()` scalar function.
- Adds a sqllogictest case verifying that `rand()` resolves successfully
and returns a `Float64` value in the expected `[0, 1)` range.
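
To make the alias mechanism concrete, here is a minimal sketch of name resolution with aliases. `FuncDef` and `Registry` are hypothetical types written for this example; they are not DataFusion's function registry, and the real lookup may differ.

```rust
use std::collections::HashMap;

/// Hypothetical function entry: a canonical name plus alternative names.
pub struct FuncDef {
    pub name: &'static str,
    pub aliases: &'static [&'static str],
}

/// Toy registry mapping every name and alias to a canonical function name.
pub struct Registry {
    by_name: HashMap<&'static str, &'static str>,
}

impl Registry {
    pub fn new() -> Self {
        Self { by_name: HashMap::new() }
    }

    /// Register the canonical name and each alias under the same target.
    pub fn register(&mut self, def: &FuncDef) {
        self.by_name.insert(def.name, def.name);
        for &alias in def.aliases {
            self.by_name.insert(alias, def.name);
        }
    }

    /// Resolve a name or alias to the canonical function name, if known.
    pub fn resolve(&self, name: &str) -> Option<&'static str> {
        self.by_name.get(name).copied()
    }
}
```

With `random` registered under the alias `rand`, both names resolve to the same entry, so the alias needs no separate implementation.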

## Are these changes tested?

Yes.

- `cargo fmt --all`
- `cargo test --package datafusion-functions random --lib`
- `cargo test --features backtrace,parquet_encryption --profile ci
--package datafusion-sqllogictest --test sqllogictests -- functions.slt`

## Are there any user-facing changes?

Yes. Users can now call `rand()` as an alias for `random()`.
Bumps [runs-on/action](https://github.com/runs-on/action) from 2.1.0 to
2.1.2.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/runs-on/action/releases">runs-on/action's
releases</a>.</em></p>
<blockquote>
<h2>v2.1.2</h2>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/runs-on/action/compare/v2.1.1...v2.1.2">https://github.com/runs-on/action/compare/v2.1.1...v2.1.2</a></p>
<h2>v2.1.1</h2>
<h2>What's Changed</h2>
<ul>
<li>Upgrade go, deps, and sanitize values better by <a
href="https://github.com/crohr"><code>@​crohr</code></a> in <a
href="https://redirect.github.com/runs-on/action/pull/30">runs-on/action#30</a></li>
<li>Propagate actions runtime token by <a
href="https://github.com/crohr"><code>@​crohr</code></a> in <a
href="https://redirect.github.com/runs-on/action/pull/34">runs-on/action#34</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/runs-on/action/compare/v2.1.0...v2.1.1">https://github.com/runs-on/action/compare/v2.1.0...v2.1.1</a></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/runs-on/action/commit/d141ef83eb66d096ce8afc767e09115a65c63b60"><code>d141ef8</code></a>
dist: rebuild binaries for v2.1.2</li>
<li><a
href="https://github.com/runs-on/action/commit/c5df5533f2cf2dec19ef109dcd2a6d28fae8928f"><code>c5df553</code></a>
Add manual release workflow with gpg signing and checksums</li>
<li><a
href="https://github.com/runs-on/action/commit/e46a3c6d62a5df0b0e0f8b5fdc50f04c5ecf147f"><code>e46a3c6</code></a>
dist: rebuild binaries</li>
<li><a
href="https://github.com/runs-on/action/commit/88629fc77cb7a6a251ccdf8290a55cd08928794e"><code>88629fc</code></a>
Send runtime token to Magic Cache config</li>
<li><a
href="https://github.com/runs-on/action/commit/6e9cb2b901c5953075dbcd45edf77b0192c8e6b1"><code>6e9cb2b</code></a>
Update actions</li>
<li><a
href="https://github.com/runs-on/action/commit/408de89233b56a743b98d68195bb1923d60543aa"><code>408de89</code></a>
dist: rebuild binaries</li>
<li><a
href="https://github.com/runs-on/action/commit/e8a2e6d65ae450e212faf12b84577c012cf764ab"><code>e8a2e6d</code></a>
Remove dead code: unused MetricSummary fields and
calculateMin/calculateMax f...</li>
<li><a
href="https://github.com/runs-on/action/commit/3a86586805f7eeb1ef42293156a97c326ff040fd"><code>3a86586</code></a>
dist: rebuild binaries</li>
<li><a
href="https://github.com/runs-on/action/commit/61a7be1daf74b5271d1238f1d4158fa3a5545ae6"><code>61a7be1</code></a>
build: upgrade to go 1.26</li>
<li>See full diff in <a
href="https://github.com/runs-on/action/compare/742bf56072eb4845a0f94b3394673e4903c90ff0...d141ef83eb66d096ce8afc767e09115a65c63b60">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=runs-on/action&package-manager=github_actions&previous-version=2.1.0&new-version=2.1.2)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
… bits (#22158)

## Which issue does this PR close?

N/A

## Rationale for this change

1. Instead of counting set bits to check whether at least one bit is
set, use the existing helpers on `BooleanArray` that check this
directly.
2. Avoid unnecessary `BooleanBuffer` bitwise operations and reuse the mask.

## What changes are included in this PR?

Reuse the mask, and use the helper to check whether at least one value is false.
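
The difference between the two checks can be sketched on a plain bitmask. This is only an illustration under the assumption that the mask is stored as packed `u64` words; the function names are made up here and are not the `BooleanArray` helpers this PR uses.

```rust
/// Counts every set bit and then compares with zero.
/// This always scans the whole mask, even if the first word is non-zero.
pub fn any_set_by_count(words: &[u64]) -> bool {
    words.iter().map(|w| w.count_ones() as usize).sum::<usize>() > 0
}

/// Answers the same question but stops at the first non-zero word.
pub fn any_set_early_exit(words: &[u64]) -> bool {
    words.iter().any(|&w| w != 0)
}
```

Both functions return the same answer; the second simply avoids work that the "at least one" question does not need.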

## Are these changes tested?

Existing tests

## Are there any user-facing changes?

No

------

Cc @gstvg, @comphead
@pull pull Bot locked and limited conversation to collaborators May 14, 2026
@pull pull Bot added the ⤵️ pull label May 14, 2026
@pull pull Bot merged commit 18a219c into buraksenn:main May 14, 2026

3 participants