Performance Improvements for Large Bundle Validation #1595

@jordanpadams

💡 Description

Addresses a cluster of performance and memory bottlenecks that cause validate to fail or run unacceptably slowly on large PDS4 bundles (23,000+ products) and table products with millions of records. Each sub-issue targets a specific hot spot identified through profiling and user-reported failures.

Sub-Issues

NASA-PDS/validate

NASA-PDS/pds4-jparser

Acceptance Criteria

  • Large bundles (23,000+ products) complete validation without OutOfMemoryError
  • Schematron XSLT compiled once per unique schematron, not once per label
  • No Pattern object allocated more than once per unique pattern string during field validation
  • Table products with millions of records validate using buffered I/O, not byte-at-a-time reads
  • 0xFF data bytes pass through RawTableReader correctly and are not treated as EOF
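Two of the criteria above (one Pattern per unique pattern string, and 0xFF bytes not being mistaken for EOF under buffered reads) can be illustrated with a minimal sketch. This is not the code in validate or pds4-jparser: the class `ValidationSketch` and the methods `cachedPattern` and `countFfBytes` are hypothetical names chosen for illustration, and the 64 KiB buffer size is an arbitrary assumption.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.regex.Pattern;

public class ValidationSketch {
    // Hypothetical cache: at most one compiled Pattern per unique pattern string.
    private static final Map<String, Pattern> PATTERNS = new ConcurrentHashMap<>();

    // Returns the cached Pattern for this regex, compiling it only on first use.
    static Pattern cachedPattern(String regex) {
        return PATTERNS.computeIfAbsent(regex, Pattern::compile);
    }

    // Reads the stream in 64 KiB chunks (assumed size) instead of one byte per call,
    // and counts how many data bytes equal 0xFF.
    static long countFfBytes(InputStream in) throws IOException {
        byte[] buffer = new byte[64 * 1024];
        long count = 0;
        int n;
        // End of stream is signalled by the RETURN VALUE (-1), never by buffer contents.
        while ((n = in.read(buffer)) != -1) {
            for (int i = 0; i < n; i++) {
                // As a signed Java byte, 0xFF is -1. It is ordinary data here,
                // so it must not be compared against the EOF sentinel.
                if (buffer[i] == (byte) 0xFF) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {0x01, (byte) 0xFF, 0x02, (byte) 0xFF};
        System.out.println(countFfBytes(new ByteArrayInputStream(data)));
        // Same regex string twice: the second call reuses the first compiled Pattern.
        System.out.println(cachedPattern("[0-9]+") == cachedPattern("[0-9]+"));
    }
}
```

The classic way this bug appears is storing the result of `InputStream.read()` in a `byte` before testing for -1, which makes a 0xFF data byte indistinguishable from end of stream. Keeping the value as an `int` until after the EOF check, or using the bulk `read(byte[])` form as above, avoids the ambiguity.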

Metadata

Assignees

Type

No type

Projects

Status

ToDo

Status

In Progress

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests
