WIP: Upgrade DataFusion to arrow-rs/parquet 58.0.0 / object_store 0.13.0
#19728
base: main
| alltypes_plain.parquet | 1851 | 8882 | 2 | page_index=false |
| alltypes_tiny_pages.parquet | 454233 | 269266 | 2 | page_index=true |
| lz4_raw_compressed_larger.parquet | 380836 | 1347 | 2 | page_index=false |
| alltypes_tiny_pages.parquet | 454233 | 269074 | 2 | page_index=true |
I think this reduction in metadata size is a direct consequence of @WaterWhisperer's PR to improve the PageEncoding representation.
Run benchmarks
🤖: Benchmark completed
run benchmarks
run benchmark tpch
🤖: Benchmark completed
🤖: Benchmark completed
object_store 0.13.0
let timestamp = Utc::now();
let range = options.range.clone();

let head = options.head;
A substantial amount of the changes in this PR are due to the upgrade to object_store 0.13, where several of the trait methods (e.g. get, get_opts, head, etc.) have been consolidated.
You can see the upgrade guide here
Sadly, the docs.rs page is broken, and I have filed a ticket for that:
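To illustrate the kind of consolidation described in this comment, here is a minimal, self-contained sketch of the general pattern: one required method that takes an options struct, with the convenience methods layered on top of it. Everything in it (`Store`, `StoreExt`, `GetOptions`, `GetResult`, `InMemory`) is hypothetical and synchronous for brevity; it is not the object_store API, so consult the upgrade guide for the real signatures.

```rust
use std::collections::HashMap;
use std::ops::Range;

/// Hypothetical options struct: one request type covers full reads,
/// ranged reads, and metadata-only ("head") requests.
#[derive(Debug, Default, Clone)]
struct GetOptions {
    range: Option<Range<usize>>,
    head: bool,
}

/// Hypothetical result: the object's total size plus the requested bytes.
#[derive(Debug, PartialEq)]
struct GetResult {
    size: usize,
    bytes: Vec<u8>,
}

/// Core trait: implementors only need to write `get_opts`.
trait Store {
    fn get_opts(&self, location: &str, options: GetOptions) -> Result<GetResult, String>;
}

/// Convenience methods, all expressed through the single required method.
trait StoreExt: Store {
    fn get(&self, location: &str) -> Result<GetResult, String> {
        self.get_opts(location, GetOptions::default())
    }

    fn get_range(&self, location: &str, range: Range<usize>) -> Result<Vec<u8>, String> {
        let options = GetOptions { range: Some(range), ..Default::default() };
        Ok(self.get_opts(location, options)?.bytes)
    }

    fn head(&self, location: &str) -> Result<usize, String> {
        let options = GetOptions { head: true, ..Default::default() };
        Ok(self.get_opts(location, options)?.size)
    }
}

// Every `Store` gets the convenience methods for free.
impl<T: Store + ?Sized> StoreExt for T {}

/// Toy in-memory store used only for this illustration.
struct InMemory {
    objects: HashMap<String, Vec<u8>>,
}

impl Store for InMemory {
    fn get_opts(&self, location: &str, options: GetOptions) -> Result<GetResult, String> {
        let data = self
            .objects
            .get(location)
            .ok_or_else(|| format!("not found: {location}"))?;
        let size = data.len();
        if options.head {
            // Metadata-only request: report the size, return no payload.
            return Ok(GetResult { size, bytes: Vec::new() });
        }
        let bytes = match options.range {
            Some(r) => data
                .get(r.clone())
                .ok_or_else(|| format!("range {r:?} out of bounds"))?
                .to_vec(),
            None => data.clone(),
        };
        Ok(GetResult { size, bytes })
    }
}

fn demo_store() -> InMemory {
    let mut objects = HashMap::new();
    objects.insert("a.txt".to_string(), b"hello world".to_vec());
    InMemory { objects }
}

fn main() {
    let store = demo_store();
    println!("{:?}", store.head("a.txt"));
    println!("{:?}", store.get_range("a.txt", 0..5));
}
```

With this shape, a wrapper store only has to forward one method, which is presumably why so many wrapper implementations in this PR shrink or change.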
    self.inner.list_with_delimiter(prefix).await
}

async fn copy(&self, from: &Path, to: &Path) -> Result<()> {
copy and copy_if_not_exists were consolidated
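As a rough sketch of what consolidating two copy methods can look like, the example below routes both behaviours through one function that takes a mode flag. The names `CopyMode`, `CopyError`, `copy_opts`, and `InMemory` are hypothetical and chosen for illustration only; they are not taken from the object_store crate.

```rust
use std::collections::HashMap;

/// Hypothetical mode flag replacing two separate methods.
#[derive(Debug, Clone, Copy, PartialEq)]
enum CopyMode {
    /// Replace the destination if it already exists.
    Overwrite,
    /// Fail if the destination already exists.
    Create,
}

#[derive(Debug, PartialEq)]
enum CopyError {
    NotFound(String),
    AlreadyExists(String),
}

/// Toy in-memory store used only for this illustration.
#[derive(Default)]
struct InMemory {
    objects: HashMap<String, Vec<u8>>,
}

impl InMemory {
    /// Single code path that handles both behaviours.
    fn copy_opts(&mut self, from: &str, to: &str, mode: CopyMode) -> Result<(), CopyError> {
        let data = self
            .objects
            .get(from)
            .cloned()
            .ok_or_else(|| CopyError::NotFound(from.to_string()))?;
        if mode == CopyMode::Create && self.objects.contains_key(to) {
            return Err(CopyError::AlreadyExists(to.to_string()));
        }
        self.objects.insert(to.to_string(), data);
        Ok(())
    }

    /// Thin wrapper: unconditional copy.
    fn copy(&mut self, from: &str, to: &str) -> Result<(), CopyError> {
        self.copy_opts(from, to, CopyMode::Overwrite)
    }

    /// Thin wrapper: copy only when the destination is absent.
    fn copy_if_not_exists(&mut self, from: &str, to: &str) -> Result<(), CopyError> {
        self.copy_opts(from, to, CopyMode::Create)
    }
}

fn main() {
    let mut store = InMemory::default();
    store.objects.insert("a".to_string(), vec![1]);
    store.objects.insert("b".to_string(), vec![2]);
    println!("{:?}", store.copy_if_not_exists("a", "b"));
    println!("{:?}", store.copy("a", "b"));
}
```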
// Testing case1:
// Inserting query schema mismatch: Expected table field 'a' with type Float16, but got 'a' with type Utf8.
// And the cast is not supported from Utf8 to Float16.
// And the cast is not supported from Binary to Float16.
@Jefffrey added support for Utf8->Float casting in apache/arrow-rs#9262, so this test started failing because it expected the cast not to work 😆
run benchmarks
🤖: Benchmark completed
Which issue does this PR close?
58.0.0 (January 2026) arrow-rs#8466
Outstanding issues
Rationale for this change
Keep DataFusion up to date (and test Arrow using the DataFusion tests).
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?