Skip to content

feat: enable debug assertions in CI profile, fix unaligned memory access bug#3652

Merged
andygrove merged 5 commits intoapache:mainfrom
andygrove:ci-debug-assertions
Mar 10, 2026
Merged

feat: enable debug assertions in CI profile, fix unaligned memory access bug#3652
andygrove merged 5 commits intoapache:mainfrom
andygrove:ci-debug-assertions

Conversation

@andygrove
Copy link
Copy Markdown
Member

@andygrove andygrove commented Mar 9, 2026

Closes #3653

Summary

  • Enable debug-assertions = true in the CI cargo profile to catch potential issues with unsafe code at test time
  • Enable RUST_BACKTRACE=1 in all CI workflows to get full stack traces when panics occur

Rationale

We want to verify whether any debug assertions fire on the main branch, particularly in the columnar-to-row and other unsafe code paths.

…kflows

Enable debug-assertions in the CI cargo profile to catch potential
issues with unsafe code. Enable RUST_BACKTRACE=1 in all CI workflows
to get full stack traces when panics occur.
@andygrove andygrove changed the title feat: enable debug assertions in CI profile and RUST_BACKTRACE feat: enable debug assertions in CI profile Mar 9, 2026
Log panic messages (including debug assertion text and location) to
stderr in the panic hook so they are visible in CI logs. Wrap the
spawned tokio execution task with catch_unwind so panics propagate
as DataFusionError instead of silently dropping the channel.
@andygrove
Copy link
Copy Markdown
Member Author

Comet native panic: panicked at core/src/execution/shuffle/spark_unsafe/row.rs:293:29:
misaligned pointer dereference: address must be a multiple of 0x8 but is 0xc1a80000c

Spark's UnsafeRow data from JVM may not be 8-byte aligned. The null
bitset accessors were casting to *const i64 and dereferencing directly,
which triggers a panic with debug assertions enabled. Use
read_unaligned/write_unaligned instead.
@andygrove andygrove changed the title feat: enable debug assertions in CI profile feat: enable debug assertions in CI profile, fix unaligned memory access bug Mar 9, 2026
@andygrove andygrove requested a review from mbutrovich March 9, 2026 23:51
@andygrove
Copy link
Copy Markdown
Member Author

@mbutrovich I think this explains the bus error you saw recently

while let Some(batch) = stream.next().await {
if tx.send(batch).await.is_err() {
break;
let result = std::panic::AssertUnwindSafe(async {
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes are for catching panics that happen in tokio threads

Comment thread native/core/src/errors.rs
Comment on lines +149 to +151
// Log the panic message and location to stderr so it is visible in CI logs
// even if the exception message is lost crossing the FFI boundary
eprintln!("Comet native panic: {panic_info}");
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the main change for improving reporting of panics

Copy link
Copy Markdown
Contributor

@parthchandra parthchandra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's get this in. It's important to catch these issues early.

@andygrove andygrove merged commit 1772097 into apache:main Mar 10, 2026
115 checks passed
@andygrove
Copy link
Copy Markdown
Member Author

Thanks @parthchandra

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Spark SQL test failure when debug assertions are enabled

2 participants