On-the-fly schema creation from json-ld to parquet #420
Unanswered
eldondevat asked this question in Q&A
Replies: 1 comment 1 reply
Currently, we have not implemented schema inference from JSON in arrow-go. If you can define a schema beforehand and parse the JSON against that schema, the rest is fairly easy in arrow-go: creating record batches or a table from JSON and writing a Parquet file from that table are both supported.
I am interested in processing some JSON-LD files on the fly through arrow-go into Parquet files. I recently found a pyarrow strategy that does approximately what I'm hoping for: https://stackoverflow.com/a/67126122 . Is there a similarly trivial strategy with arrow-go, something that can infer a schema and output a Parquet file on the fly? My primary interest here is a more compact data representation: the Parquet files ended up being about 10% of the size of my existing representation.