feat: Add num_rows and TaskContext to CometUDFBridge.evaluate #4306
mbutrovich wants to merge 6 commits into
andygrove left a comment
LGTM with one question for my understanding
Dropping
Even with that fixed, Workable fixes (move the suite into |
Which issue does this PR close?
Closes #.
Rationale for this change
CometUDFs can run on tokio threads while the original task thread is parked, so the TaskContext cannot be reliably retrieved from Spark on the thread that actually runs the UDF. We now stash the TaskContext on the native side via the planner. We also need num_rows for CometUDFs that take no input columns, since there is no input from which to infer the batch size. These changes are already in #4267.
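A minimal sketch of the problem and the fix, in plain Java with no Spark or Comet dependency. It assumes the context is held in a thread-local on the task thread; FakeTaskContext, CURRENT, and this evaluate signature are hypothetical stand-ins for illustration, not the actual CometUDFBridge API.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class TaskContextHandoff {
    // Hypothetical stand-in for a task context; it only carries an id here.
    static final class FakeTaskContext {
        final long taskAttemptId;
        FakeTaskContext(long taskAttemptId) { this.taskAttemptId = taskAttemptId; }
    }

    // Assumption: the context is bound to the task thread through a thread-local.
    static final ThreadLocal<FakeTaskContext> CURRENT = new ThreadLocal<>();

    // Hypothetical bridge entry point: the context and the row count arrive as
    // explicit arguments, so a UDF with no input columns still knows how many
    // rows to produce.
    static long[] evaluate(FakeTaskContext ctx, int numRows) {
        long[] out = new long[numRows];
        for (int i = 0; i < numRows; i++) {
            out[i] = ctx.taskAttemptId;
        }
        return out;
    }

    // Returns true only if a different thread can see the task thread's thread-local.
    static boolean workerSeesThreadLocal() throws Exception {
        CURRENT.set(new FakeTaskContext(7L));
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            Callable<Boolean> probe = () -> CURRENT.get() != null;
            return worker.submit(probe).get();
        } finally {
            worker.shutdown();
            CURRENT.remove();
        }
    }

    // Captures the context up front and hands it to the worker explicitly.
    static long[] evaluateOnWorker(long taskAttemptId, int numRows) throws Exception {
        FakeTaskContext captured = new FakeTaskContext(taskAttemptId);
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            Callable<long[]> job = () -> evaluate(captured, numRows);
            return worker.submit(job).get();
        } finally {
            worker.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("worker sees thread-local: " + workerSeesThreadLocal());
        System.out.println("rows produced: " + evaluateOnWorker(7L, 3).length);
    }
}
```

The worker thread stands in for a tokio thread calling back into the JVM: the thread-local lookup comes back empty there, while the explicitly passed context and row count are always available.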
What changes are included in this PR?
Thread TaskContext and num_rows through CometUDFBridge.evaluate.
How are these changes tested?
No new tests. Nothing in production on this branch invokes the bridge yet, so end-to-end coverage lands with #4267 when the dispatcher drives it for real. An earlier unit suite was dropped because of the Arrow shading boundary in common/: the suite compiled against unshaded Arrow but CI runs against the shaded jar, and the test-defined CometUDF subclasses cannot override the shaded interface from outside common/.