Spark currently casts Protobuf enum fields to StringType because neither Spark nor, in our case, Delta Lake has a native enumeration type. The from_protobuf function does accept an option, enums.as.ints, that emits enum values as integers instead:
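from pyspark.sql.protobuf.functions import from_protobuf

# df, message_name, and descriptor_path are defined elsewhere.
deserialized_df = df.select(
    from_protobuf(
        df["protobuf_data"],
        messageName=message_name,
        descFilePath=descriptor_path,
        options={"enums.as.ints": "true"},  # emit enum values as integers, not strings
    ).alias("person_data")
)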
If this works inside the Spark engine, we then need to find out whether the same option is also available as a table property for a Zerobus table.
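One way to check whether the option takes effect in Spark is to inspect the schema of the deserialized column. The sketch below is only an illustration: it works on the dict returned by `df.schema.jsonValue()`, the helper names `nested_field_type` and `enum_is_int` are made up for this note, and `phone_type` is a hypothetical enum field name, not one taken from our message definition.

```python
from typing import Any, Dict, Optional


def nested_field_type(schema_json: Dict[str, Any], struct_col: str, field: str) -> Optional[Any]:
    """Return the JSON type of `field` inside the struct column `struct_col`.

    `schema_json` is expected to be the dict produced by `df.schema.jsonValue()`.
    Returns None if the column is missing, is not a struct, or lacks the field.
    """
    for col in schema_json.get("fields", []):
        if col["name"] != struct_col:
            continue
        col_type = col["type"]
        # Struct columns carry a nested dict; primitive columns carry a plain string.
        if not isinstance(col_type, dict):
            return None
        for sub in col_type.get("fields", []):
            if sub["name"] == field:
                return sub["type"]
    return None


def enum_is_int(schema_json: Dict[str, Any], struct_col: str, field: str) -> bool:
    """True if the (hypothetical) enum field was deserialized as an integer."""
    return nested_field_type(schema_json, struct_col, field) == "integer"
```

With the DataFrame from the snippet above, this could be called as `enum_is_int(deserialized_df.schema.jsonValue(), "person_data", "phone_type")`. A result of False with a type of "string" would suggest the option was not applied.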