Skip to content

refactor(plantuml): replace linker with idmap-based cross-diagram linking#290

Open
AAmbuj wants to merge 7 commits into
eclipse-score:mainfrom
AAmbuj:amsh_plantuml_linker_review_refactor
Open

refactor(plantuml): replace linker with idmap-based cross-diagram linking#290
AAmbuj wants to merge 7 commits into
eclipse-score:mainfrom
AAmbuj:amsh_plantuml_linker_review_refactor

Conversation

@AAmbuj

@AAmbuj AAmbuj commented Jun 24, 2026

Copy link
Copy Markdown
Contributor

Replace the separate linker tool with idmap sidecar files emitted directly by the parser.

Key changes:

  • Add puml_idmap crate: emits *.idmap.json sidecars with defines/references roles detected from diagram structure (component children, class methods/ variables, sequence participants)
  • Extend puml_cli with --source-name and --idmap-output-dir args; use the workspace-relative path identity instead of the basename for a stable, path-unique source identifier
  • Rewrite clickable_plantuml Sphinx extension: load idmaps at builder-inited, resolve references via FQN→alias lookup with a proximity tiebreak and a tie→no-link guard; emit correct relative URLs for svg_obj mode; and percent-encode injected PlantUML URLs
  • Update architectural_design.bzl: parser now emits 3 outputs (fbs, lobster, idmap); remove linker action and _linker attr
  • Delete plantuml/linker (replaced by idmap approach)

@AAmbuj AAmbuj force-pushed the amsh_plantuml_linker_review_refactor branch from dbb35d4 to 6b7cf91 Compare June 24, 2026 09:59
@AAmbuj AAmbuj self-assigned this Jun 24, 2026
@AAmbuj AAmbuj force-pushed the amsh_plantuml_linker_review_refactor branch from 6b7cf91 to 024b9b9 Compare June 24, 2026 10:06

@hoe-jo hoe-jo left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR, looks promising, some minor topics:

  1. Remove basename collision fallback; make source-path matching exact

Problem: extension joins idmap source (workspace short path from --source-name) to the Sphinx plantuml node path; when they differ under Bazel it falls back to idmap_by_basename with first-seen-wins, so two proxy.puml in different packages silently mislink.

Define one canonical key = workspace-relative POSIX path; normalize idmap source and node filename into it. Determine the staging→workspace offset once in on_builder_inited from app.srcdir.

Delete idmap_by_basename, _resolve_idmap fallback, and _ENV_IDMAP_BY_BASENAME.
If two idmaps share a key, raise build error (not warning). If a node can't map to a key, skip + warn.
Acceptance: two same-basename diagrams in different dirs each link only to their own definer; duplicate keys fail the build.

  1. Fix proximity tiebreak path namespace
    Problem: _proximity_tiebreak/_common_prefix_length may compare a staging source key against workspace-relative candidates. Once P0-1 lands, all are canonical keys — verify and add an assertion. Acceptance: tiebreak only on canonical keys; tie still → no link + warning.

  2. rust_test for puml_idmap
    Add #[cfg(test)] + rust_test in BUILD. Cover: component children/none, class members, sequence participants, Empty, alias fallback, deterministic sort.

  3. pytest for clickable_plantuml
    New tests/ + py_test. Cover: index build, FQN-before-alias, self-link skip, single vs tie, svg_obj vs svg URL, percent-encode, anchor.

5: _escape_plantuml_url — # in _SAFE can break [[ ]]; verify one-URL-per-alias dedup matches old contract.
6: confirm DefaultInfo idmap leak into consumer srcs intended; else split provider.
7: README invariant: --source-name must equal srcdir-relative path; unique basenames.

@AAmbuj AAmbuj force-pushed the amsh_plantuml_linker_review_refactor branch 2 times, most recently from 7662a39 to 2ee3857 Compare July 1, 2026 06:42
@AAmbuj AAmbuj marked this pull request as draft July 1, 2026 07:23
@AAmbuj AAmbuj force-pushed the amsh_plantuml_linker_review_refactor branch 8 times, most recently from 19a3445 to 8cae034 Compare July 2, 2026 09:06
@AAmbuj

AAmbuj commented Jul 2, 2026

Copy link
Copy Markdown
Contributor Author

What changed

  • Replaced the old standalone PlantUML linker flow with an idmap-based cross-diagram linking pipeline.
  • Removed the linker implementation and migrated link-data generation into parser outputs.
  • Added a new Rust idmap module and wired parser/build targets to consume it.
  • Refactored clickable_plantuml to resolve links from canonical source-path keys (instead of basename-based matching).
  • Added path-aware definer resolution (FQN-first, alias fallback), ambiguity tiebreaking, and safer URL escaping/injection behavior.
  • Updated Bazel rules (architectural_design, unit_design, fmea, dependable_element, sphinx_module) to produce/stage/pass idmap sidecars correctly.
  • Added new Rust and Python test coverage for idmap generation and clickable link resolution.
  • Updated rule and architecture docs to reflect the new idmap pipeline and invariants.

Why

  • Basename-based matching could silently mislink when diagrams shared names across directories.
  • The old linker phase duplicated responsibility and increased complexity.
  • Canonical path matching + idmap sidecars provides deterministic cross-diagram linking and better scalability.
  • The new structure improves maintainability and keeps parser/link semantics in one coherent data pipeline.

@AAmbuj

AAmbuj commented Jul 2, 2026

Copy link
Copy Markdown
Contributor Author

Test:

Unit tests (Python):

  • clickable_plantuml index build/loading
  • duplicate-key handling
  • FQN-before-alias resolution
  • self-link skip
  • single candidate vs tie behavior
  • svg_obj vs svg URL mode handling
  • percent-encoding and anchor handling
  • one-URL-per-alias contract

Unit tests (Rust):

  • component/class/sequence/empty idmap generation
  • alias fallback behavior
  • deterministic ordering/sorting

Bazel targets validated:

  • bazel test //plantuml/sphinx/clickable_plantuml/tests:test_clickable_plantuml
  • bazel test //plantuml/parser/puml_idmap:puml_idmap_test
  • bazel test //plantuml/parser/puml_cli:puml_cli_test

@AAmbuj AAmbuj marked this pull request as ready for review July 2, 2026 09:29
@AAmbuj AAmbuj requested a review from hoe-jo July 2, 2026 09:29

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change the per-node "no resolvable source path" logger.warning to logger.debug. Diagrams that aren't idmap-backed are a normal case, not an error.
Optionally add one aggregated logger.info at end of build: "N plantuml nodes had no idmap (not linkable)".
Test: add a case with a plantuml node absent from source_keys and assert no WARNING record is emitted (via caplog).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Revert to symlink for all files and instead make _node_source_key robust to symlinks by using os.path.realpath on both the node path and workspace_offset before prefix-matching. normpath doesn't resolve symlinks, which is the actual reason real files were needed — fixing it in Python removes the need to copy anything.

Comment thread plantuml/parser/puml_idmap/src/lib.rs Outdated
let output_path = output_dir.join(format!("{file_stem}.idmap.json"));

let json = serde_json::to_string_pretty(&idmap)
.map_err(|e| io::Error::new(io::ErrorKind::Other, e))?;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

io::Error::other(e)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make staging identity-preserving:
Stage .puml/.idmap.json at srcdir/<short_path> (i.e. source_prefix="", no strip_prefix for these), so staged-relative == short_path == idmap source.
Then keep symlinks, drop the sed rewrite, and drop execroot surgery

@AAmbuj AAmbuj force-pushed the amsh_plantuml_linker_review_refactor branch 3 times, most recently from a707cb0 to 5b7aa6a Compare July 3, 2026 07:27
@AAmbuj AAmbuj force-pushed the amsh_plantuml_linker_review_refactor branch from 5b7aa6a to 91d64c7 Compare July 3, 2026 07:29
AAmbuj added 7 commits July 3, 2026 13:15
The linker binary parsed FlatBuffers output to produce plantuml_links.json.
It is superseded by the idmap-based approach where each diagram emits its
own .idmap.json sidecar, making a separate link-resolution step unnecessary.
Replace all references to plantuml_links.json / standalone linker with the
idmap sidecar flow. tooling_chain.puml updated to reflect the new pipeline.
Emits a per-source .idmap.json with defines[] / references[] entries used
by the clickable_plantuml Sphinx extension to resolve cross-diagram links.
Supports Component, Class, Sequence, Activity (empty), and FTA diagrams.
Calls write_idmap_to_file() for every resolved diagram. Passes short_path
as source_name so both idmap and lobster outputs use the stable
workspace-relative path rather than the sandbox-local filesystem path.
- _load_idmap_files() indexes all .idmap.json sidecars at build-finished time
- _resolve_definer() uses FQN-first + proximity tiebreak to find the defining source
- _node_source_key() maps each plantuml node back to its idmap source key
- Removed: _load_link_mappings(), _collect_link_data() (linker-era code)
- architectural_design.bzl, unit_design.bzl: _run_puml_parser now returns
  (fbs, lobster, idmap); all callers updated
- fmea.bzl: idmap sidecars collected from root_causes; add duplicate-basename
  guard (fail-fast with actionable error message)
- sphinx_module.bzl: identity-preserving symlink staging for .puml and
  .idmap.json so source keys remain stable under any strip_prefix config
Covers: _load_idmap_files (index building, key conflicts), _node_source_key
(symlink resolution, suffix matching, tie-breaking), _resolve_definer
(FQN priority, alias fallback, self-link skip, warning on tie),
_build_target_url, _escape_plantuml_url, _inject_links, and
on_doctree_resolved end-to-end integration.
@AAmbuj AAmbuj force-pushed the amsh_plantuml_linker_review_refactor branch from 91d64c7 to 8f4f83b Compare July 3, 2026 07:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

2 participants