feat: Add num_rows and TaskContext to CometUDFBridge.evaluate#4306

Open: mbutrovich wants to merge 6 commits into apache:main from mbutrovich:udf-bridge-numrows-taskcontext

@mbutrovich
Contributor

@mbutrovich mbutrovich commented May 12, 2026

Which issue does this PR close?

Closes #.

Rationale for this change

CometUDFs can run on Tokio threads while the original task thread is parked, so the TaskContext cannot be reliably retrieved from Spark on the thread that executes the UDF. We now stash the TaskContext on the native side via the planner and pass it across the bridge. Separately, CometUDFs that take no input columns need num_rows, because there is no input vector from which to infer the batch size. Both changes are already in #4267.
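To illustrate why a thread-local lookup is unreliable here, the following is a minimal sketch in plain Java, not Comet or Spark code. `FakeTaskContext`, `CURRENT`, and `onWorker` are hypothetical stand-ins: `CURRENT` mimics a thread-local context lookup, and the single-thread executor plays the role of a runtime thread that is not the task thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class TaskContextHandoff {
    /** Hypothetical stand-in for a task context; it only carries an id. */
    static final class FakeTaskContext {
        final long taskAttemptId;
        FakeTaskContext(long taskAttemptId) { this.taskAttemptId = taskAttemptId; }
    }

    /** Mimics a thread-local context registry (assumption for illustration). */
    static final ThreadLocal<FakeTaskContext> CURRENT = new ThreadLocal<>();

    /** Thread-local lookup: returns null on any thread that did not set CURRENT. */
    static Long lookupViaThreadLocal() {
        FakeTaskContext ctx = CURRENT.get();
        return ctx == null ? null : ctx.taskAttemptId;
    }

    /** Explicit handoff: the caller supplies the context it captured earlier. */
    static long lookupViaArgument(FakeTaskContext ctx) {
        return ctx.taskAttemptId;
    }

    /** Runs the task on a separate thread, standing in for a runtime worker thread. */
    static Long onWorker(Callable<Long> task) {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            return worker.submit(task).get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            worker.shutdown();
        }
    }

    public static void main(String[] args) {
        FakeTaskContext ctx = new FakeTaskContext(42L);
        CURRENT.set(ctx); // set on the "task thread" only
        try {
            System.out.println("thread-local on worker: "
                + onWorker(TaskContextHandoff::lookupViaThreadLocal)); // null
            System.out.println("explicit argument on worker: "
                + onWorker(() -> lookupViaArgument(ctx))); // 42
        } finally {
            CURRENT.remove();
        }
    }
}
```

The thread-local lookup comes back empty on the worker, while the explicitly passed context survives the thread hop. That is the motivation for carrying the context as an argument rather than looking it up at the call site.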

What changes are included in this PR?

Thread TaskContext and num_rows through CometUDFBridge.evaluate.
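As a rough illustration of what an explicit row count buys a UDF with no input columns, here is a self-contained sketch. It is not the signature from this PR: `SketchUdf`, `Ctx`, `PartitionIdUdf`, and the `evaluate` helper are hypothetical, and plain `long[]` arrays stand in for Arrow vectors.

```java
import java.util.Arrays;

final class NumRowsBridgeSketch {
    /** Hypothetical stand-in for the task context handed over by the caller. */
    static final class Ctx {
        final int partitionId;
        Ctx(int partitionId) { this.partitionId = partitionId; }
    }

    /** Hypothetical UDF contract; long[] columns stand in for Arrow vectors. */
    interface SketchUdf {
        long[] evaluate(long[][] inputs, int numRows, Ctx ctx);
    }

    /** Zero-argument UDF: with no input columns, numRows is the only source of the output length. */
    static final class PartitionIdUdf implements SketchUdf {
        @Override
        public long[] evaluate(long[][] inputs, int numRows, Ctx ctx) {
            long[] out = new long[numRows];
            Arrays.fill(out, ctx.partitionId);
            return out;
        }
    }

    /** Hypothetical bridge entry point: checks lengths, then forwards to the UDF. */
    static long[] evaluate(SketchUdf udf, long[][] inputs, int numRows, Ctx ctx) {
        for (long[] column : inputs) {
            if (column.length != numRows) {
                throw new IllegalArgumentException("input column length != numRows");
            }
        }
        long[] out = udf.evaluate(inputs, numRows, ctx);
        if (out.length != numRows) {
            throw new IllegalStateException("UDF returned " + out.length + " rows, expected " + numRows);
        }
        return out;
    }

    public static void main(String[] args) {
        long[] out = evaluate(new PartitionIdUdf(), new long[0][], 4, new Ctx(3));
        System.out.println(Arrays.toString(out)); // [3, 3, 3, 3]
    }
}
```

Passing the row count explicitly also lets the bridge validate the result length in one place, instead of each UDF inferring it from whichever input happens to be present.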

How are these changes tested?

No new tests. Nothing in production on this branch invokes the bridge yet, so end-to-end coverage lands with #4267 when the dispatcher drives it for real. An earlier unit suite was dropped because of the Arrow shading boundary in common/: the suite compiled against unshaded Arrow but CI runs against the shaded jar, and the test-defined CometUDF subclasses cannot override the shaded interface from outside common/.

@mbutrovich mbutrovich requested a review from andygrove May 12, 2026 18:54
Comment thread common/src/main/java/org/apache/comet/udf/CometUdfBridge.java Outdated
Member

@andygrove andygrove left a comment

LGTM with one question for my understanding

@mbutrovich
Contributor Author

Dropping CometUdfBridgeSuite from this PR.

common/ shades org.apache.arrow.* to org.apache.comet.shaded.arrow.* at the package phase. CI runs ./mvnw -Prelease install, so the shaded jar is on the test classpath. The suite was compiled against unshaded Arrow, hence the NoSuchMethodError on CometArrowAllocator().

Even with that fixed, the test class RowCountTestUDF (which extends CometUDF) is compiled against the unshaded ValueVector while the runtime interface uses the shaded one, so the override does not bind and the bridge would hit AbstractMethodError. The override has to be compiled inside common/ for the shade plugin to rewrite it.
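The binding problem can be sketched without Arrow or the shade plugin. In the example below, `UnshadedValueVector` and `ShadedValueVector` are hypothetical stand-ins for the same class before and after relocation, and `bindsAll` is an illustrative helper that compares signatures by reflection. The real failure would surface as an AbstractMethodError at call time; this sketch only shows that the two method signatures are distinct as far as the JVM is concerned.

```java
import java.lang.reflect.Method;

final class ShadedOverrideSketch {
    // Stand-ins for one class before and after package relocation (hypothetical names).
    static final class UnshadedValueVector {}
    static final class ShadedValueVector {}

    /** The interface as it would look in the shaded jar at runtime. */
    interface ShadedUdf {
        ShadedValueVector evaluate(ShadedValueVector input);
    }

    /** What a UDF compiled against the unshaded type effectively declares. */
    static final class UdfCompiledUnshaded {
        public UnshadedValueVector evaluate(UnshadedValueVector input) { return input; }
    }

    /** The same UDF once its type references have been rewritten. */
    static final class UdfCompiledShaded implements ShadedUdf {
        @Override
        public ShadedValueVector evaluate(ShadedValueVector input) { return input; }
    }

    /** True if impl has a public method matching every interface method by name, parameters, and return type. */
    static boolean bindsAll(Class<?> impl, Class<?> iface) {
        for (Method wanted : iface.getMethods()) {
            try {
                Method found = impl.getMethod(wanted.getName(), wanted.getParameterTypes());
                if (!found.getReturnType().equals(wanted.getReturnType())) {
                    return false;
                }
            } catch (NoSuchMethodException e) {
                return false; // same method name, but different parameter types
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(bindsAll(UdfCompiledUnshaded.class, ShadedUdf.class)); // false
        System.out.println(bindsAll(UdfCompiledShaded.class, ShadedUdf.class));   // true
    }
}
```

Methods are matched on name plus parameter and return types, so a method that mentions the unrelocated type is a different method from the one the relocated interface declares, even though the source text looks identical.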

Workable fixes (move the suite into common/src/test/, or move the helper UDFs into common/src/main/) all add cost or ship test fixtures in the published jar. Nothing in production on this branch invokes the bridge yet, so end-to-end coverage lands with #4267 when the dispatcher drives it for real.
