Is your feature request related to a problem? Please describe.
Currently there is no automated way to convert free-form text into a structured GEST (scene graph) representation in mta-sim. This limits the ability to leverage textual data for scenario generation, annotation, or downstream processing that relies on formal scene graphs. Manual conversion is time-consuming and can introduce inconsistencies.
Describe the solution you'd like
Implement a text-to-GEST parser that:
- Accepts free-form or semi-structured text as input.
- Uses formal GEST rules (to be defined based on the GEST paper) to parse and extract entities, relationships, actions, and temporal information.
- Outputs a structured JSON representation of the scene graph (GEST).
- Clusters the text by regions/locations, integrating these segments into the final graph while preserving temporal relationships.
- Is modular and extensible to accommodate updates to GEST rules or new entity types.
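As a rough illustration of the shape this could take, here is a minimal sketch, not a proposed implementation. Everything in it is an assumption made for the example: the `Event` fields, the `rule` registry, the regex-based extraction, and the JSON layout (`events`, `temporal`, `regions`) are hypothetical stand-ins, since the actual GEST rules and schema still need to be extracted from the GEST paper. It only shows how pluggable rules, per-location clustering, and text-order temporal edges might fit together.

```python
import json
import re
from dataclasses import dataclass, asdict
from typing import Callable, Dict, List, Optional


@dataclass
class Event:
    """Hypothetical event node; the real GEST schema is still to be defined."""
    id: str
    actor: str
    action: str
    location: str


# A rule takes a sentence and an event id, and returns an Event or None.
Rule = Callable[[str, str], Optional[Event]]
RULES: List[Rule] = []


def rule(fn: Rule) -> Rule:
    """Register an extraction rule so new rules can be added without touching the parser."""
    RULES.append(fn)
    return fn


@rule
def actor_action_rule(sentence: str, event_id: str) -> Optional[Event]:
    # Toy heuristic (assumption): first word is the actor, second is the action,
    # and the location is whatever follows "in the".
    head = re.match(r"(\w+)\s+(\w+)", sentence)
    if head is None:
        return None
    place = re.search(r"\bin the (\w+)", sentence)
    location = place.group(1) if place else "unknown"
    return Event(event_id, head.group(1), head.group(2), location)


def parse_text(text: str) -> Dict[str, object]:
    """Parse text into a JSON-serializable graph with events, regions, and temporal edges."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    events: List[Event] = []
    for sentence in sentences:
        for extract in RULES:
            event = extract(sentence, f"e{len(events)}")
            if event is not None:
                events.append(event)
                break  # first matching rule wins

    # Cluster event ids by location.
    regions: Dict[str, List[str]] = {}
    for event in events:
        regions.setdefault(event.location, []).append(event.id)

    # Preserve temporal order across regions: link consecutive events in text order.
    temporal = [
        {"from": a.id, "to": b.id, "relation": "next"}
        for a, b in zip(events, events[1:])
    ]
    return {
        "events": [asdict(e) for e in events],
        "temporal": temporal,
        "regions": regions,
    }


graph = parse_text(
    "John cooks in the kitchen. Mary reads in the office. John eats in the kitchen."
)
print(json.dumps(graph, indent=2))
```

Keeping the temporal edges global (across all events) while the regions only group event ids means that clustering by location does not lose the original ordering. A real implementation would replace the toy regex rule with rules derived from the GEST specification.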
Describe alternatives you've considered
- Manual annotation and conversion (not scalable)
- Heuristic-based scripts with no formal rule integration (less robust and error-prone)
Acceptance Criteria
Additional context
- GEST rule specification will be extracted separately from the GEST paper.
- Consider integration with the story teller system to generate (text, GEST) pairs.