perf: use shallow projection where applicable #17682
@hudi-bot run azure
@danny0405 Please take a look.
} else {
  GenericData.Record rec = new GenericData.Record(targetSchema);
  for (Schema.Field field : targetSchema.getFields()) {
    if (record.hasField(field.name())) {
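record.hasField and record.get both search for the field by name in the record's internal schema, so each field is looked up twice. Suggested change, which looks the field up once and then reads it by position: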
Schema.Field sourceField = record.getSchema().getField(field.name());
if (sourceField == null) {
  rec.put(field.pos(), null);
} else {
  rec.put(field.pos(), record.get(sourceField.pos()));
}
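To make the single-lookup idea concrete outside of Hudi, here is a minimal plain-Java sketch. It deliberately does not use Avro: MiniSchema, MiniRecord, and projectShallow are hypothetical stand-ins for Schema, GenericData.Record, and the projection helper, and they only model the name-to-position lookup. Treat it as an illustration of the pattern under those assumptions, not as the code in this PR.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

public class ShallowProjectionSketch {

    // Hypothetical stand-in for a schema: maps each field name to its position.
    static final class MiniSchema {
        private final Map<String, Integer> positions = new LinkedHashMap<>();

        MiniSchema(String... fieldNames) {
            for (String name : fieldNames) {
                positions.put(name, positions.size());
            }
        }

        // Returns the field position, or null when the field does not exist.
        Integer positionOf(String name) {
            return positions.get(name);
        }

        int size() {
            return positions.size();
        }

        Iterable<Map.Entry<String, Integer>> fields() {
            return positions.entrySet();
        }
    }

    // Hypothetical stand-in for a record: values are stored by position.
    static final class MiniRecord {
        final MiniSchema schema;
        final Object[] values;

        MiniRecord(MiniSchema schema) {
            this.schema = schema;
            this.values = new Object[schema.size()];
        }

        Object get(int pos) {
            return values[pos];
        }

        void put(int pos, Object value) {
            values[pos] = value;
        }
    }

    // Shallow projection: one name lookup per target field, then a positional
    // read. Values are copied by reference; nested values are not rewritten.
    static MiniRecord projectShallow(MiniRecord source, MiniSchema target) {
        MiniRecord out = new MiniRecord(target);
        for (Map.Entry<String, Integer> field : target.fields()) {
            Integer sourcePos = source.schema.positionOf(field.getKey());
            out.put(field.getValue(), sourcePos == null ? null : source.get(sourcePos));
        }
        return out;
    }

    public static void main(String[] args) {
        MiniRecord source = new MiniRecord(new MiniSchema("id", "name", "ts"));
        source.put(0, 1);
        source.put(1, "a");
        source.put(2, 100L);

        // The target reorders fields and adds one that the source lacks.
        MiniRecord projected = projectShallow(source, new MiniSchema("ts", "id", "missing"));
        System.out.println(Arrays.toString(projected.values)); // prints [100, 1, null]
    }
}
```

The design point mirrors the review suggestion: the name-based lookup happens once per target field, and a missing source field is written as null instead of triggering a second search.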
Fixed
82f4c1c to 7a08b71
@danny0405 @cshuo
 * the schemas are identical in field count.
 */
public static GenericRecord projectRecordToNewSchemaShallow(GenericRecord record, Schema targetSchema) {
It's safer to keep the parameter type as IndexedRecord, as rewriteRecordWithNewSchema does, to avoid unnecessary type coercion in HoodieAvroRecord.
Fixed
Thanks for contributing, LGTM.
Describe the issue this Pull Request addresses
This closes #17679
Move optimized method for shallow copy with new schema to utils and use it where applicable.
Summary and Changelog
Searched for HoodieAvroUtils.rewriteRecordWithNewSchema calls and found one class with dead code and one class where the function can be applied for prepending metadata.
Impact
Performance improvement
Risk Level
None
Documentation Update
None
Contributor's checklist