Problem
hierarchy and relationships in the schema metadata serve overlapping purposes but are implemented separately, creating complexity and inconsistency.
A hierarchy level is conceptually a relationship with:
- Structural ordering (position in the type chain)
- A resolvedParents resolution phase (post-parse link following)
- Optional selfRef/selfRefField for same-type parent links
Relationships have:
- Embedding format hints (heading, list, table, page)
- Matchers for heading text
- fieldOn to control which side holds the link
Both support field, fieldOn, and multi. But hierarchy goes through resolveHierarchyEdges() post-parse, while relationships are resolved inline during extractEmbeddedNodes(). This split means:
- Hierarchy nodes can't use heading/list/table embedding patterns
- Relationships don't get resolvedParents, making cross-node queries harder
- parse-embedded has to maintain separate logic paths for hierarchy-typed nodes vs. relationship-typed nodes, even though they often appear in the same markdown structure
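To make the overlap concrete, here is a minimal TypeScript sketch of how the two metadata shapes might share a base. Only field, fieldOn, multi, selfRef, selfRefField, the embedding hints, and the matchers come from the description above; every interface name, the "parent" | "child" values for fieldOn, the parent/type/order properties, and the hierarchyAsRelationships helper are assumptions for illustration, not the current schema.

```typescript
// Hypothetical shapes for illustration; the real schema metadata may differ.
type FieldOn = "parent" | "child"; // assumed values, not confirmed by the schema

// Fields that both hierarchy levels and relationships support.
interface LinkBase {
  field: string;
  fieldOn: FieldOn;
  multi: boolean;
}

// Hierarchy-only additions: position comes from array order in the type chain.
interface HierarchyLevel extends LinkBase {
  type: string;
  selfRef?: boolean;
  selfRefField?: string;
}

// Relationship-only additions: embedding hints and heading matchers.
interface Relationship extends LinkBase {
  parent: string;
  type: string;
  embedding?: "heading" | "list" | "table" | "page";
  matchers?: string[];
}

// A relationship that also carries its position in the type chain.
interface OrderedRelationship extends Relationship {
  order: number;
}

// View each adjacent pair of hierarchy levels as one ordered relationship.
function hierarchyAsRelationships(levels: HierarchyLevel[]): OrderedRelationship[] {
  const out: OrderedRelationship[] = [];
  for (let i = 1; i < levels.length; i++) {
    const { type, field, fieldOn, multi } = levels[i];
    out.push({ parent: levels[i - 1].type, type, field, fieldOn, multi, order: i });
  }
  return out;
}
```

In this sketch a chain of N levels yields N - 1 ordered relationships, which is one way to read "hierarchy as a constrained subset of relationships" while keeping the ordering explicit.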
What to investigate / decide
- Can hierarchy be modelled as a constrained subset of relationships? Specifically, can we add ordering + resolvedParents semantics to relationships without losing the distinction that hierarchy imposes (strict type chain, depth inference, no arbitrary embedding)?
- Should resolvedParents apply to relationship-typed nodes too? If a node is embedded via a relationship (e.g. Application inside a Solution), it would be useful to have resolvedParents populated, just as hierarchy nodes do.
- Where should the resolution phase live? Currently resolveHierarchyEdges() is a separate post-parse pass only for hierarchy. If relationships also needed resolution, this pass would need to generalise.
- What stays distinct? Hierarchy's depth-based type inference and structural ordering are specific to the tree metaphor. These probably shouldn't bleed into general relationships. The goal is to reduce duplicated logic, not to collapse the concepts entirely.
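As a starting point for the resolution-phase question, a generalised post-parse pass could look roughly like the sketch below. It is not the existing resolveHierarchyEdges() implementation: ParsedNode, ParentLinkRule, resolveParentEdges, the "parent" | "child" values for fieldOn, and the idea that link fields hold arrays of node ids are all assumptions made for illustration.

```typescript
// Hypothetical node shape; the real parsed nodes may look different.
interface ParsedNode {
  id: string;
  type: string;
  fields: Record<string, string[]>; // assumed: link fields hold target node ids
  resolvedParents: string[];
}

// One rule per parent/child link, whether it came from hierarchy or relationships.
interface ParentLinkRule {
  parentType: string;
  childType: string;
  field: string;
  fieldOn: "parent" | "child"; // which side holds the link field (assumed values)
}

// Post-parse pass: follow link fields and populate resolvedParents on children.
function resolveParentEdges(nodes: ParsedNode[], rules: ParentLinkRule[]): void {
  const byId = new Map<string, ParsedNode>();
  for (const node of nodes) byId.set(node.id, node);

  for (const rule of rules) {
    const holderType = rule.fieldOn === "child" ? rule.childType : rule.parentType;
    for (const holder of nodes) {
      if (holder.type !== holderType) continue;
      for (const targetId of holder.fields[rule.field] ?? []) {
        const target = byId.get(targetId);
        if (!target) continue; // dangling link: left for validation to report
        const parent = rule.fieldOn === "child" ? target : holder;
        const child = rule.fieldOn === "child" ? holder : target;
        if (parent.type !== rule.parentType || child.type !== rule.childType) continue;
        if (!child.resolvedParents.includes(parent.id)) {
          child.resolvedParents.push(parent.id);
        }
      }
    }
  }
}
```

Because the pass only consumes rules, hierarchy levels and relationships could each be lowered to ParentLinkRule entries before it runs, while depth-based type inference stays in hierarchy-specific code.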
Expected outcome
- Reduced duplication between resolveHierarchyEdges, validate-hierarchy, and the relationship parsing/validation code
- A clearer mental model: hierarchy is the primary type chain; relationships are lateral or embedded associations, some of which also warrant parent resolution
- parse-embedded can use a single traversal path that handles both, gated by which metadata is active
Related
- Companion issue: Refactor parse-embedded to use clean mdast traversal with explicit signal tracking
- This is likely a blocker for that refactor — the traversal design depends on how these two concepts relate