Investigate enum support in Zerobus to simplify "datagen" #3

@newfront

Description

Currently, Spark automatically casts a protobuf enum to a StringType because neither Spark nor, in our case, Delta Lake has a native enumeration type. However, from_protobuf accepts an enums.as.ints option that renders enum fields as their integer values instead:

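from pyspark.sql.protobuf.functions import from_protobuf

# Render protobuf enum fields as integers instead of their string names.
deserialized_df = df.select(from_protobuf(
    df["protobuf_data"],
    messageName=message_name,
    descFilePath=descriptor_path,
    options={"enums.as.ints": "true"}
).alias("person_data"))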

If this works inside the Spark engine, then we need to figure out whether the same option is also available as a table property for a Zerobus table.
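One way to check the Spark side of that question is to inspect the schema that from_protobuf produces and see whether the enum column comes back as an integer or a string. The sketch below is only an illustration: it walks the dict returned by df.schema.jsonValue(), so it can be tried without a cluster. The helper name field_type and the enum field name status are hypothetical, not taken from the actual message definition.

```python
from typing import Any, Optional


def field_type(schema: dict[str, Any], path: str) -> Optional[Any]:
    """Return the JSON type of a dotted field path, or None if it is absent.

    `schema` is expected to be the dict produced by `df.schema.jsonValue()`.
    """
    current: Any = schema
    for name in path.split("."):
        # Only struct types have named child fields to descend into.
        if not isinstance(current, dict) or current.get("type") != "struct":
            return None
        match = next((f for f in current["fields"] if f["name"] == name), None)
        if match is None:
            return None
        current = match["type"]
    return current


def _person_schema(enum_type: str) -> dict[str, Any]:
    """Build a hand-written stand-in for the schema of `deserialized_df`.

    The `status` enum field is a hypothetical example.
    """
    return {
        "type": "struct",
        "fields": [{
            "name": "person_data",
            "nullable": True,
            "metadata": {},
            "type": {
                "type": "struct",
                "fields": [{
                    "name": "status",
                    "type": enum_type,
                    "nullable": True,
                    "metadata": {},
                }],
            },
        }],
    }


as_strings = _person_schema("string")   # default behaviour
as_ints = _person_schema("integer")     # expected with enums.as.ints enabled

print(field_type(as_strings, "person_data.status"))
print(field_type(as_ints, "person_data.status"))
```

Against a real DataFrame the call would be field_type(deserialized_df.schema.jsonValue(), "person_data.status"), with the path replaced by the actual enum field. A result of "integer" would suggest the option took effect in Spark. It says nothing yet about whether Zerobus honors an equivalent table property.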

Metadata

    Labels

    help wanted (Extra attention is needed), question (Further information is requested)
