[Avro] Decoder panics on flush when schema contains map whose value is non-nullable #8253

@yongkyunlee

Describe the bug

When decoding data whose schema contains a map with non-nullable values, such as:

{
  "name": "map_of_strings",
  "type": {
    "type": "map",
    "values": "string"
  },
  "doc": "Map with string values"
}

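the decoder fails with the following error. The expected and found types differ only in the `value` field of the map entries (`nullable: false` expected, `nullable: true` found):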

InvalidArgumentError("column types must match schema types, expected Map(Field { name: \"entries\", data_type: Struct([Field { name: \"key\", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: \"value\", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }]), nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }, false) but found Map(Field { name: \"entries\", data_type: Struct([Field { name: \"key\", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: \"value\", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }]), nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }, false) at column index 0")
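The mismatch in the error above can be modeled with a minimal, standard-library-only sketch. The `DataType` and `Field` types below are simplified stand-ins, not the real arrow-rs types, and `check_column` is a hypothetical helper that assumes the column type is compared to the schema type by strict equality, as the error text suggests. It only shows that two map types that agree on everything except the nullability of `value` are rejected as different.

```rust
// Simplified stand-ins for Arrow's types; NOT the real arrow-rs definitions.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Utf8,
    Struct(Vec<Field>),
    // (entries field, keys_sorted)
    Map(Box<Field>, bool),
}

#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
    nullable: bool,
}

fn field(name: &str, data_type: DataType, nullable: bool) -> Field {
    Field { name: name.to_string(), data_type, nullable }
}

/// Builds a map-of-strings type whose `value` field has the given nullability.
fn map_of_strings(value_nullable: bool) -> DataType {
    let entries = DataType::Struct(vec![
        field("key", DataType::Utf8, false),
        field("value", DataType::Utf8, value_nullable),
    ]);
    DataType::Map(Box::new(field("entries", entries, false)), false)
}

/// Hypothetical check: assumes column and schema types must be strictly equal.
fn check_column(expected: &DataType, found: &DataType) -> Result<(), String> {
    if expected == found {
        Ok(())
    } else {
        Err(format!(
            "column types must match schema types, expected {:?} but found {:?}",
            expected, found
        ))
    }
}

fn main() {
    let expected = map_of_strings(false); // what the schema declares
    let found = map_of_strings(true); // what the decoded column reports

    // Only the `value` nullability differs, yet the types are not equal.
    assert!(check_column(&expected, &found).is_err());
    // With matching nullability the check passes.
    assert!(check_column(&expected, &map_of_strings(false)).is_ok());
}
```

Under this assumption, the failure goes away if the decoded column's `value` field carries the same nullability as the schema's.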

To Reproduce

This can be reproduced with a simple unit test that decodes data using the schema above and then flushes the decoder.
