[lake/paimon] Support NestedRow types for tiering paimon #2260
base: main
Pull request overview
This PR adds support for NestedRow types in Paimon lake tiering by implementing the previously unsupported getRow() methods in the data conversion layer between Fluss and Paimon.
Key Changes:
- Implemented recursive nested row conversion by creating new FlussRowAsPaimonRow instances for nested row fields
- Extended array support to handle arrays containing nested row elements
- Added 8 comprehensive test cases covering simple nested rows, deeply nested structures, arrays of rows, null handling, and various data types
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| FlussRowAsPaimonRow.java | Implemented getRow() method to convert nested Fluss rows to Paimon rows recursively with proper null handling |
| FlussArrayAsPaimonArray.java | Implemented getRow() method to support arrays containing nested row elements |
| FlussRecordAsPaimonRowTest.java | Added 8 test methods covering nested row scenarios including primitive types, complex types, null handling, deep nesting, and arrays of rows |
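To make the recursive conversion summarized above easier to picture, here is a minimal sketch of the adapter pattern. It does not use the real Fluss or Paimon classes: `SourceRow`, `SourceArray`, `TargetRow`, `TargetArray`, `ListData`, and both adapter classes are hypothetical stand-ins invented for illustration. Returning `null` for a null nested row is likewise an assumption about what "proper null handling" means here, not a statement about the PR's actual code.

```java
import java.util.Arrays;
import java.util.List;

public class NestedRowSketch {

    // Hypothetical stand-ins for the source-side (Fluss-like) accessors.
    interface SourceRow {
        boolean isNullAt(int pos);
        int getInt(int pos);
        String getString(int pos);
        SourceRow getRow(int pos, int numFields);
        SourceArray getArray(int pos);
    }

    interface SourceArray {
        int size();
        boolean isNullAt(int pos);
        SourceRow getRow(int pos, int numFields);
    }

    // Hypothetical stand-ins for the target-side (Paimon-like) accessors.
    interface TargetRow {
        boolean isNullAt(int pos);
        int getInt(int pos);
        String getString(int pos);
        TargetRow getRow(int pos, int numFields);
        TargetArray getArray(int pos);
    }

    interface TargetArray {
        int size();
        boolean isNullAt(int pos);
        TargetRow getRow(int pos, int numFields);
    }

    // List-backed demo data that can act as either a source row or a source array.
    static final class ListData implements SourceRow, SourceArray {
        private final List<Object> values;
        ListData(Object... values) { this.values = Arrays.asList(values); }
        public int size() { return values.size(); }
        public boolean isNullAt(int pos) { return values.get(pos) == null; }
        public int getInt(int pos) { return (Integer) values.get(pos); }
        public String getString(int pos) { return (String) values.get(pos); }
        public SourceRow getRow(int pos, int numFields) { return (SourceRow) values.get(pos); }
        public SourceArray getArray(int pos) { return (SourceArray) values.get(pos); }
    }

    // Row adapter: exposes a source row through the target interface without copying.
    static final class SourceRowAsTargetRow implements TargetRow {
        private final SourceRow row;
        SourceRowAsTargetRow(SourceRow row) { this.row = row; }
        public boolean isNullAt(int pos) { return row.isNullAt(pos); }
        public int getInt(int pos) { return row.getInt(pos); }
        public String getString(int pos) { return row.getString(pos); }
        public TargetRow getRow(int pos, int numFields) {
            // Assumed null handling: a null nested row stays null.
            if (row.isNullAt(pos)) {
                return null;
            }
            // Recursion: the nested source row is wrapped in a new adapter instance.
            return new SourceRowAsTargetRow(row.getRow(pos, numFields));
        }
        public TargetArray getArray(int pos) {
            return row.isNullAt(pos) ? null : new SourceArrayAsTargetArray(row.getArray(pos));
        }
    }

    // Array adapter: row elements are wrapped with the same row adapter.
    static final class SourceArrayAsTargetArray implements TargetArray {
        private final SourceArray array;
        SourceArrayAsTargetArray(SourceArray array) { this.array = array; }
        public int size() { return array.size(); }
        public boolean isNullAt(int pos) { return array.isNullAt(pos); }
        public TargetRow getRow(int pos, int numFields) {
            if (array.isNullAt(pos)) {
                return null;
            }
            return new SourceRowAsTargetRow(array.getRow(pos, numFields));
        }
    }

    public static void main(String[] args) {
        SourceRow address = new ListData("Berlin", 10115);
        SourceArray previous = new ListData(new ListData("Paris", 75001), null);
        SourceRow person = new ListData(7, address, previous, null);

        TargetRow adapted = new SourceRowAsTargetRow(person);
        System.out.println(adapted.getRow(1, 2).getString(0));          // prints "Berlin"
        System.out.println(adapted.getArray(2).getRow(0, 2).getInt(1)); // prints "75001"
        System.out.println(adapted.getRow(3, 2));                       // prints "null"
    }
}
```

Because each call to `getRow` wraps the nested row in a fresh adapter, rows nested to any depth and arrays of rows are handled by the same two classes, with no separate code path per nesting level.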
luoyuxia
left a comment
@XuQianJin-Stars Thanks for the PR. Let's add two tests:
- Add a test in PaimonRowAsFlussRow for the nested row type
- Add an IT for the nested row type; you can refer to what we do for the array type
@Test
void testSimpleNestedRow() {
Could you please combine all the NestedRow test methods into a single method, as the main branch does for the array type?
Too many methods make it hard to track...
f5ef8a4 to 8280f47
- Created FlussNestedRowAsPaimonRow adapter class to convert Fluss nested rows to Paimon nested rows
- Implemented getRow method in FlussRowAsPaimonRow to support nested row fields in tables
- Implemented getRow method in FlussArrayAsPaimonArray to support arrays of nested rows
- Added comprehensive test cases covering:
  * Simple nested rows with primitive types
  * Deeply nested rows (row within row)
  * Arrays of nested rows
  * Nested rows with array fields
  * Nested rows with all primitive types
  * Null nested rows
  * Nested rows with nullable fields
  * Nested rows with decimal and timestamp fields
8280f47 to 48fa51d
@luoyuxia Hi, I have updated the PR. Please help review when you have some time.
@ParameterizedTest
@MethodSource("tieringNestedRowWriteArgs")
Replace this with @ValueSource(booleans = {true, false}) and remove the tieringNestedRowWriteArgs method.
d678a38 to fe6404d
@wuchong Hi, I have updated the PR. Please help review when you have some time.
Purpose
Linked issue: close #2251
This PR adds support for NestedRow types in Paimon lake tiering.
Brief change log
- Implemented FlussRowAsPaimonRow.getRow() to support nested row fields by creating new FlussRowAsPaimonRow instances recursively
- Implemented FlussArrayAsPaimonArray.getRow() to support arrays containing nested row elements
- Added test cases in FlussRecordAsPaimonRowTest covering various nested row scenarios
Tests
Unit Tests:
- FlussRecordAsPaimonRowTest#testSimpleNestedRow - Tests basic nested row conversion
- FlussRecordAsPaimonRowTest#testDeeplyNestedRow - Tests multi-level nested rows
- FlussRecordAsPaimonRowTest#testArrayOfNestedRows - Tests arrays containing nested rows
- FlussRecordAsPaimonRowTest#testNestedRowWithArrayField - Tests nested rows with array fields
- FlussRecordAsPaimonRowTest#testNestedRowWithAllPrimitiveTypes - Tests all primitive types in nested rows
- FlussRecordAsPaimonRowTest#testNullNestedRow - Tests null nested row handling
- FlussRecordAsPaimonRowTest#testNestedRowWithNullableFields - Tests nullable fields in nested rows
- FlussRecordAsPaimonRowTest#testNestedRowWithDecimalAndTimestamp - Tests complex types in nested rows

All tests verify correct data conversion between Fluss nested rows and Paimon nested rows.
API and Format
API: No public API changes
Format: No storage format changes. This change only affects the internal data conversion layer between Fluss and Paimon.
Documentation
New Feature: No
This change extends existing Paimon lake tiering functionality to support NestedRow types, which is an internal enhancement. The data type conversion mapping table in the documentation (website/docs/streaming-lakehouse/integrate-data-lakes/paimon.md) already covers ROW types, so no documentation update is required.