JGI: Provide Data in Ingest Directory and Organize ETL Scripts #12

@vchendrix

Description

Implement JGI data ingest per standardized requirements:

  • Create an ingest/jgi subfolder.
  • Place all JGI data files in this directory, each formatted as a JSON list (a top-level array enclosed in square brackets) and named sequentially: jgi_00001.json, jgi_00002.json, etc.
  • Limit each file to ~25 MB and include only complete records, so that no record is split across files.
  • All files must conform to the latest release schema.
  • Document and implement a file splitting strategy if necessary.
  • All ETL scripts for JGI should be placed in contrib/jgi.
  • Ensure independent file validation is possible.
  • Document the JGI ingest process, folder structure, file format, and splitting strategy.
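
One possible splitting strategy, sketched in Python under stated assumptions: records are already available as an iterable of dicts, "~25 MB" is read as 25 * 1024 * 1024 bytes, and the helper names (`chunk_records`, `write_chunks`, `check_file`) are hypothetical rather than part of any existing ETL script. Records are kept whole and packed into files until the next record would push the file over the limit. The check at the end covers only size and JSON-list shape; conformance to the release schema is not checked here.

```python
import json
from pathlib import Path
from typing import Iterable, Iterator, List

MAX_BYTES = 25 * 1024 * 1024  # assumed reading of "~25 MB"


def chunk_records(records: Iterable[dict], max_bytes: int = MAX_BYTES) -> Iterator[List[dict]]:
    """Group whole records into lists whose serialized size stays within max_bytes."""
    chunk: List[dict] = []
    size = 2  # the enclosing "[" and "]"
    for record in records:
        encoded = len(json.dumps(record).encode("utf-8"))
        extra = encoded + (2 if chunk else 0)  # ", " separator used by json.dumps
        if chunk and size + extra > max_bytes:
            yield chunk
            chunk, size = [], 2
            extra = encoded
        # A single record larger than max_bytes still gets its own file,
        # because records are never split.
        chunk.append(record)
        size += extra
    if chunk:
        yield chunk


def write_chunks(records: Iterable[dict], out_dir: str, max_bytes: int = MAX_BYTES) -> List[Path]:
    """Write chunks as jgi_00001.json, jgi_00002.json, ... and return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, chunk in enumerate(chunk_records(records, max_bytes), start=1):
        path = out / f"jgi_{index:05d}.json"
        path.write_text(json.dumps(chunk), encoding="utf-8")
        paths.append(path)
    return paths


def check_file(path: str, max_bytes: int = MAX_BYTES) -> bool:
    """Standalone check: the file is within the size limit and parses as a JSON list."""
    p = Path(path)
    if p.stat().st_size > max_bytes:
        return False
    try:
        data = json.loads(p.read_text(encoding="utf-8"))
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    return isinstance(data, list)
```

Because `check_file` takes a single path and nothing else, each file in ingest/jgi can be checked on its own, which is one way to meet the independent-validation requirement.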

Metadata

Labels

documentation: Improvements or additions to documentation
enhancement: New feature or request
