feat(cimcheck): add IntelliJ plugin #15

Draft
spah-soptim wants to merge 64 commits into cimcheck from cimcheck-intellij

Conversation

@spah-soptim

No description provided.

spah-soptim and others added 7 commits May 28, 2026 17:31
- intellij: default javaExecutable to "" so resolveJavaExecutable falls
  through to IntelliJ's bundled JBR. The previous "java" default made the
  JBR auto-detection dead code, breaking the plugin where "java" is not on
  PATH (e.g. Windows).
- intellij: document the deliberate ".ttl" file-type claim and its
  conflict/precedence implications in plugin.xml.
- core: substitute the SHACL $PATH/?PATH placeholder only as a whole token
  (negative look-ahead) and via Matcher.quoteReplacement, so variables like
  $PATHOLOGY are left untouched and '$' in the IRI is not mistaken for a
  back-reference.
- core: in validateAutoDetect, parse the query once with the analyzer's base
  URI and pass the parsed Query to the validator (new overload), removing a
  redundant re-parse and the probe/validate base-URI inconsistency.
- core: locate property-path-chain diagnostics near the preceding segment
  (locateNear) so the squiggle lands on the right occurrence.
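
A minimal sketch of the whole-token substitution described in the third bullet, under stated assumptions: the class name, method name, and the exact regular expression are illustrative and not taken from the cimcheck sources. It only shows how a negative look-ahead plus Matcher.quoteReplacement gives the behaviour the commit describes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; names and pattern are assumptions for illustration only.
final class PathPlaceholder {
    // Matches $PATH or ?PATH only when the next character cannot continue a
    // variable name, so $PATHOLOGY is not treated as a placeholder.
    private static final Pattern PATH_TOKEN =
            Pattern.compile("[$?]PATH(?![A-Za-z0-9_])");

    static String substitutePath(String sparql, String pathIri) {
        // quoteReplacement escapes '$' and '\' so an IRI containing "$1" is
        // inserted literally instead of being read as a group back-reference.
        String replacement = Matcher.quoteReplacement("<" + pathIri + ">");
        return PATH_TOKEN.matcher(sparql).replaceAll(replacement);
    }

    public static void main(String[] args) {
        String query = "SELECT $this WHERE { $this $PATH ?v . ?v ex:p $PATHOLOGY }";
        System.out.println(substitutePath(query, "http://example.org/a$1b"));
    }
}
```

Without quoteReplacement, the "$1" in the example IRI would make replaceAll fail because the pattern has no capturing group 1.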

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- Extract ValidationConfig interface so CliConfig/LspConfig cannot silently diverge
- Lift buildNamedGraphScope() into SparqlValidationApi to eliminate copy-paste between CLI and LSP
- Delegate CLI SchemaLoader.buildIndex() to CgmesSchemaLoader so unparseable files are
  skipped-with-warn rather than causing a hard failure
- Replace hand-rolled JSON serialisation in JsonFormatter with Jackson ObjectMapper
- Unify SHACL path walkers into a single walkPath(onUri, onAlternativeGroup) callback traversal
- Extract DefaultPrefixes.declaredPrefixes() and reuse it in SparqlTextDocumentService
  to eliminate the duplicate PREFIX_PATTERN field
- Add BY/IN/TO to SPARQL_STATEMENT_KEYWORDS and guard findBadKeywordLine() against
  built-in function calls (e.g. COALESCE())
- Remove dead SchemaIndex variable in definition() handler
- Extract positionFromMessage() helper to deduplicate SPARQL error-position parsing
- Reuse tokenLengthInSource() in turtleParseErrorDiagnostic() instead of an inline scan

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@spah-soptim spah-soptim self-assigned this Jun 11, 2026
@spah-soptim spah-soptim added the enhancement New feature or request label Jun 11, 2026
The repo-wide '*.jar' ignore also excluded gradle/wrapper/gradle-wrapper.jar,
so the wrapper jar was never committed and './gradlew' failed on a fresh
clone ("Unable to access jarfile gradle-wrapper.jar"). Add a negation rule
to re-include it and commit the regenerated wrapper.

Align the wrapper distribution to gradle-9.5.0 (a valid release) to match the
gradle-version pinned in the CI and release workflows.
Configure the IntelliJ Plugin Verifier (pluginVerification {} with
recommended() IDEs and a Marketplace-aligned failureLevel) and run it in
CI's build-intellij-plugin job. The verifier flagged two defects across the
2024.2–2026.1 IDE range that the prior "fixes" had only relocated:

- addBrowseFolderListener(Project, BrowseFolderActionListener) is deprecated
  in 242, scheduled-for-removal in 243-261, and REMOVED in 262 (262 reported
  a NoSuchMethodError compatibility problem). Replace with the stable
  addActionListener + FileChooser.chooseFile pair.
- PluginManager.findEnabledPlugin is @ApiStatus.Internal in 262. Drop it and
  read the plugin version from a build-time-generated classpath resource
  (cimcheck-plugin.properties, expanded by processResources) instead.

verifyPlugin now reports all 7 verified IDEs (IC-2024.2 through IU-262 EAP)
as Compatible with no deprecated/internal/scheduled-for-removal usages.
…pe registration

- Rewrite the plugin.xml Marketplace description with Features, Requirements,
  Getting started (.cgmes/validation.json), Settings, and Troubleshooting.
- Add cimcheck/intellij/README.md mirroring the VS Code docs (diagnostics
  table, validation.json reference, IntelliJ shortcuts, LSP4IJ log location).
- Implement SparqlFileType/ShaclFileType as plain FileType instead of
  LanguageFileType(PlainTextLanguage.INSTANCE). Binding the built-in
  PlainText language caused a file-type/language association conflict that
  could block extension registration (no highlighting / file not recognised).

verifyPlugin passes against all 7 IDEs (IC-2024.2 through IU-262 EAP).
…s override

The IntelliJ plugin never managed to launch the language server ("Cannot start
server: Unable to start language server: ProcessStreamConnectionProvider
[commands=[...java.exe, -jar, ...jar]]") even though the command was correct.

Cause: LSP4IJ's ProcessStreamConnectionProvider.start() reads the private
`commands` field directly, but the provider only overrode getCommands(). The
field stayed empty, so start() threw — while toString() (which uses the getter)
printed the correct command, masking the real problem.

Populate the command list and working directory via setCommands()/
setWorkingDirectory() in an init block. Also clarify in the plugin description
and README that LSP4IJ auto-installs only on a Marketplace install, not an
install-from-disk.
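
The failure mode above can be reproduced in isolation. The sketch below uses hypothetical stand-in classes (BaseProvider, BrokenProvider, FixedProvider), not LSP4IJ's actual code, and is written in Java rather than the plugin's Kotlin. It assumes only what the commit message states: start() reads the private field, while toString() goes through the getter.

```java
import java.util.List;

// Hypothetical stand-ins; this is not LSP4IJ's implementation.
final class FieldVersusGetter {
    static class BaseProvider {
        private List<String> commands = List.of();

        public List<String> getCommands() { return commands; }
        public void setCommands(List<String> commands) { this.commands = commands; }

        public void start() {
            // Reads the field directly, so a getter override is never consulted.
            if (commands.isEmpty()) {
                throw new IllegalStateException("Unable to start language server: " + this);
            }
        }

        @Override
        public String toString() {
            // Uses the getter, so the message shows the overridden value.
            return "commands=" + getCommands();
        }
    }

    // Overrides only the getter: toString() looks right, start() still fails.
    static class BrokenProvider extends BaseProvider {
        @Override
        public List<String> getCommands() { return List.of("java", "-jar", "server.jar"); }
    }

    // Populates the field through the setter during construction.
    static class FixedProvider extends BaseProvider {
        FixedProvider() { setCommands(List.of("java", "-jar", "server.jar")); }
    }

    static boolean starts(BaseProvider provider) {
        try {
            provider.start();
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(starts(new BrokenProvider())); // false
        System.out.println(starts(new FixedProvider()));  // true
    }
}
```

Both providers print the same toString() text, which is why the logged command looked correct even though the field was empty.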

verifyPlugin passes against all 7 IDEs (IC-2024.2 through IU-262 EAP).
…etons

Plain FileType is not guaranteed to trigger the editorHighlighterProvider
pipeline — IntelliJ's highlighter factory expects LanguageFileType for the
provider lookup path to fire reliably. Introduce SparqlLanguage and
ShaclLanguage singletons (extending Language directly with unique IDs) so
each file type has its own Language, eliminating the PlainTextLanguage
conflict that caused the original revert while restoring LanguageFileType
and fixing syntax highlighting.
…ugin.xml

IntelliJ validates that the language= attribute matches what the
LanguageFileType instance returns; omitting it logs a SEVERE error and
aborts file type registration, breaking highlighting. With dedicated
SparqlLanguage/ShaclLanguage singletons the IDs are stable, so declaring
language="SPARQL" and language="SHACL" is both correct and required.
…ctory

editorHighlighterProvider is a low-level hook for custom EditorHighlighter
overrides; the standard IntelliJ highlighting pipeline for LanguageFileType
files goes through lang.syntaxHighlighterFactory instead. Replace the two
editorHighlighterProvider registrations with lang.syntaxHighlighterFactory
entries (one per language) backed by the new CimcheckSyntaxHighlighterFactory,
and remove the now-unused CimcheckEditorHighlighterProvider class.
…Mapping

LSP4IJ resolves the server by Language when the file type is a LanguageFileType;
fileTypeMapping only fires for plain FileType files. Since SparqlFileType and
ShaclFileType now extend LanguageFileType with dedicated Language singletons,
the server was no longer being activated for SPARQL/SHACL files. Replace the
two fileTypeMapping entries with languageMapping entries keyed on Language IDs
"SPARQL" and "SHACL".
…ine range

Without a ParserDefinition, IntelliJ has no PSI token structure for SPARQL/SHACL
files. When Ctrl is held, CtrlMouseHandler finds the entire file as the single PSI
element and underlines it all. Register minimal SparqlParserDefinition and
ShaclParserDefinition — flat parsers backed by the existing CimcheckLexer that
produce a token-level PSI tree without any real AST — so IntelliJ knows the
boundaries of each token and underlines only the one under the cursor.
Replace all hardcoded version strings with 0.0.0-SNAPSHOT / 0.0.0
placeholders. Two new scripts derive and apply versions from git tags:

- scripts/compute-version.sh <component>: emits X.Y.Z on a release tag
  push, or X.Y.Z-SNAPSHOT (patch+1 of last tag) on any other ref.
- scripts/set-versions.sh <cimxml-ver> [<cimcheck-ver>]: applies versions
  to all Maven poms, gradle.properties, and package.json in one call.
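
The version rule in the first bullet can be illustrated as follows. This is a sketch of the arithmetic only, in Java for illustration; the real logic lives in scripts/compute-version.sh, and the assumption that the last release version is available as a bare X.Y.Z string is mine.

```java
// Hypothetical illustration of the rule; not a port of compute-version.sh.
final class SnapshotVersion {
    // Returns X.Y.(Z+1)-SNAPSHOT for a last release version X.Y.Z.
    static String nextSnapshot(String lastRelease) {
        String[] parts = lastRelease.split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected X.Y.Z, got " + lastRelease);
        }
        int patch = Integer.parseInt(parts[2]) + 1;
        return parts[0] + "." + parts[1] + "." + patch + "-SNAPSHOT";
    }

    // On a release tag push the tag's version is used unchanged;
    // on any other ref the next patch snapshot is derived from the last tag.
    static String computeVersion(String lastRelease, boolean isReleaseTagPush) {
        return isReleaseTagPush ? lastRelease : nextSnapshot(lastRelease);
    }

    public static void main(String[] args) {
        System.out.println(computeVersion("1.4.9", false)); // 1.4.10-SNAPSHOT
        System.out.println(computeVersion("1.4.9", true));  // 1.4.9
    }
}
```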

Release workflows no longer validate pom versions against tags; the
validate-tag jobs only check the tag format and extract the version.
CI snapshot-publish jobs compute the snapshot version from git before
deploying. The duplicated versions:set / SNAPSHOT-strip blocks and the
"TODO: remove once cimxml is hardcoded as a release version" workarounds
are removed.
…elease

Add a publishing token to build.gradle.kts (read from PUBLISH_TOKEN env
var) and a new publish-jetbrains-marketplace job in cimcheck-release.yml.

The job runs in the "JetBrains Marketplace Deployment" environment,
depends on validate-tag and publish-maven-central (gates on a valid
release), builds the LSP fat JAR + plugin, and calls gradle publishPlugin.
The marketplace token must be stored as the JETBRAINS_MARKETPLACE_TOKEN
secret in that environment.
… push

cimcheck: switch gh release create to --draft and update artifacts to
  cimcheck-core jar, CLI jar, VS Code vsix, and IntelliJ plugin zip
  (LSP jar removed — it is bundled inside the extensions, not a
  standalone download).

cimxml: add a new publish-github-release job that builds the cimxml jar
  and creates a draft GitHub Release, gated on publish-maven-central.

Both releases use --generate-notes and --draft so they can be reviewed
and published manually after the automated jobs complete.
Add a rename step that copies cli/lsp JARs to include the release version
in the filename before gh release create, so all artifacts have consistent
versioned names (cimcheck-X.Y.Z.jar, cimcheck-lsp-X.Y.Z.jar).

The copy happens after the extension-bundling steps that reference the
original fixed names, so no other build step is affected.
- cimcheck-cli: finalName cimcheck-cli-${project.version}
- cimcheck-lsp: finalName cimcheck-lsp-${project.version}
- VS Code vsix: vsce package --out cimcheck-vscode-X.Y.Z.vsix
- IntelliJ zip: already cimcheck-intellij-X.Y.Z.zip (unchanged)
- cimcheck-core jar: already cimcheck-core-X.Y.Z.jar (unchanged)

build.gradle.kts copyServerJar updated to match with a glob + rename so
it is version-independent. CI and release workflows updated to reference
the new names; the post-build rename step in the release workflow is
removed.
…maven-central

verifyPlugin downloads IDE distributions, which is slow and disk-heavy, so it is
removed from build-intellij-plugin. For both cimcheck and cimxml releases, the GitHub
Packages, GitHub Release, and JetBrains Marketplace jobs now use
`if: always() && ...` so they proceed even when publish-maven-central fails.
The build-vsix and build-intellij-plugin jobs were building with the
placeholder version (0.0.0), causing CI artifacts to be named
cimcheck-vscode-0.0.0.vsix and cimcheck-intellij-0.0.0.zip.

Add fetch-tags: true and an "Apply snapshot versions" step to both
jobs so compute-version.sh can resolve the last release tag and
set-versions.sh propagates the snapshot version into gradle.properties
and package.json before the builds run.
- analysis: drop duplicate ClassRefKey/PropertyRefKey records; dedup sets now
  hold ClassReference/PropertyReference directly
- core: extract shared IriFormat util (shortIri/appendIris), removing the copy
  duplicated in SparqlQueryValidator and ShaclShapeAnalyzer
- SparqlQueryValidator: collapse validateReferences' 10 params into an
  AnalysisRefs record; merge formatClass/PropertyMessage into one
  formatMissingTermMessage; use the imported PrefixMapping
- SparqlValidationApi: add getUpdateProfileDependencies(String, Map) and
  getGraphDependencies(String, Collection) overloads for API symmetry
- lsp: switch volatile LanguageClient client to AtomicReference in
  SchemaManager and SparqlTextDocumentService
- lsp: delete unused DiagnosticConverter and fix dangling javadoc link
- tests: cover the previously-untested dependency overloads
…ndpoint

Validate SPARQL cells inside VSCode SPARQL Notebooks, not just .rq/.sparql
files, and honour the notebook "# [endpoint=...]" directive as a schema source.

- vscode client: add the vscode-notebook-cell scheme to the document selector
  and an onNotebook:sparql-notebook activation event so cells reach the server.
- server: augment file-extension detection with languageId-based classification
  (captured at didOpen, since didChange carries no languageId) so extension-less
  notebook cell URIs are recognised; IntelliJ/file behaviour is unchanged.
- server: resolve the per-cell schema from the endpoint directive — a local
  .ttl/.rdf/.owl file is loaded via the existing CgmesSchemaLoader and cached by
  resolved path; absent directive falls back to .cgmes/validation.json. Remote
  SPARQL endpoints are recognised but not yet loaded.
A notebook "# [endpoint=https://…]" directive now loads the schema from the
endpoint itself, where the CGMES profiles live in per-profile named graphs.

- core: CgmesSchemaLoader.indexFromGraphs(Iterable<Graph>) wraps each graph as a
  CimProfile and builds an RdfsSchemaIndex, skipping non-profile/duplicate graphs.
  Re-asserts the cim namespace prefix (detected from the graph's IRIs) before
  wrapping, since profile detection is prefix-based and SPARQL CONSTRUCT results
  arrive without prefixes.
- lsp: EndpointGraphFetcher enumerates schema named graphs (rdfs:Class/owl:Ontology,
  excluding instance data) and CONSTRUCTs each via Jena QueryExecutionHTTP with a
  per-query timeout.
- lsp: SchemaManager loads remote endpoints asynchronously on the schema-loader
  thread (never blocking the validator), caches by endpoint, negative-caches
  failures, and revalidates open documents once the schema lands.
spah-soptim and others added 13 commits June 12, 2026 23:42
…/Turtle

When a SHACL/Turtle document's schema cannot be resolved (e.g. an unreachable
# [endpoint=...]), the LSP previously published no diagnostics, silently hiding
broken shapes. Mirror the existing SPARQL fallback: report Turtle parse errors
and syntax-check embedded SPARQL fragments, skipping schema-dependent checks.

- core: add static SparqlValidationApi.checkShaclSyntaxOnly(Graph), the SHACL
  counterpart of checkSyntaxOnly(String)
- lsp: extract parseTurtle helper; route the unresolved-schema + endpoint branch
  of validateShacl through new publishShaclSyntaxOnly
- test: ShaclSyntaxOnlyValidationTest covering valid/broken fragments, sh:prefixes,
  and the no-schema/no-fragment cases
… detection

Move SPARQL-endpoint schema loading out of the LSP and into the core library so
it is usable headlessly (CLI/pipeline), and add automatic detection of which
instance named graph holds which CGMES profile.

Core (de.soptim.opencgmes.cimcheck.core.schema):
- SparqlGraphSource abstraction with HttpSparqlGraphSource (moved from the LSP
  EndpointGraphFetcher, keeping the Fuseki update→query 405 sibling fallback) and
  DatasetSparqlGraphSource for in-process/test use.
- NamedGraphProfileResolver: classifies each instance graph to its dominant
  profile by discriminating-term coverage (terms declared by exactly one
  profile), sampling predicates + rdf:type objects only — it never validates
  instance data.
- EndpointSchema + EndpointSchemaLoader: one call fetches the schema graphs,
  builds the index, and auto-maps instance graphs to profiles. An endpoint that
  exposes no CIM schema yields EndpointSchema.noSchema() rather than throwing.
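
One way to read "discriminating-term coverage" is sketched below. It is an assumption-laden illustration, not NamedGraphProfileResolver itself: the class and method names are hypothetical, coverage is approximated by a plain hit count, and the example terms are made up.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical sketch; scoring by hit count is an assumption.
final class ProfileClassifier {
    // Maps each term declared by exactly one profile to that profile.
    static Map<String, String> discriminatingTerms(Map<String, Set<String>> profileTerms) {
        Map<String, String> owner = new HashMap<>();
        Set<String> shared = new HashSet<>();
        for (Map.Entry<String, Set<String>> profile : profileTerms.entrySet()) {
            for (String term : profile.getValue()) {
                if (shared.contains(term)) {
                    continue; // already known to belong to several profiles
                }
                String previous = owner.putIfAbsent(term, profile.getKey());
                if (previous != null && !previous.equals(profile.getKey())) {
                    owner.remove(term); // declared twice: not discriminating
                    shared.add(term);
                }
            }
        }
        return owner;
    }

    // Picks the profile whose discriminating terms occur most often among the
    // sampled predicates and rdf:type objects; empty if none occur at all.
    static Optional<String> dominantProfile(Set<String> sampledTerms,
                                            Map<String, Set<String>> profileTerms) {
        Map<String, String> owner = discriminatingTerms(profileTerms);
        Map<String, Integer> hits = new TreeMap<>();
        for (String term : sampledTerms) {
            String profile = owner.get(term);
            if (profile != null) {
                hits.merge(profile, 1, Integer::sum);
            }
        }
        return hits.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey);
    }
}
```

A graph whose sampled terms are all shared between profiles yields an empty result, which corresponds to the "graph could not be classified" warning mentioned for the LSP.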

LSP: SchemaManager.loadRemoteEndpoint now uses EndpointSchemaLoader, stores the
auto-detected per-graph scope in ResolvedSchema, and shows visible warnings when
an endpoint has no schema (falls back to syntax-only) or when a graph could not
be classified. Old LSP EndpointGraphFetcher removed.

CLI: new --endpoint/-e (auto schema + graph mapping) and --strict-endpoint
(no-schema → exit 2 instead of the default warn + syntax-only).

Verified end-to-end against a live Fuseki: graphs auto-mapped, per-graph scoped
validation flags a class only valid in another profile, and both no-schema
branches behave as designed.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…OT_CONFIGURED)

CGMES queries routinely navigate the RDFS schema directly, e.g.
  GRAPH <cgmes:schema:EQ> { ?t rdfs:subClassOf* cim:Switch } .
Endpoint auto-detection excluded schema graphs from the per-graph scope, so such
queries were flagged GRAPH_NOT_CONFIGURED even though they are valid — they read
the meta-schema, not instance data.

EndpointSchemaLoader now maps every detected schema graph to all profiles, so CIM
terms used inside a schema graph validate permissively (cim:Switch resolves; a
typo is still caught) instead of producing a spurious warning. EndpointSchema
gains instanceGraphsMapped() so the LSP/CLI status messages still report the
instance-graph count separately from the schema graphs.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ects

The implied-type INFO hint ("?x has no explicit rdf:type but its use of <P>
implies <C>") only counts a concrete rdf:type as "typed", so it fired on the
canonical CGMES idiom

    GRAPH <schema> { ?type rdfs:subClassOf* cim:Switch }
    GRAPH <model>  { ?sw a ?type ; cim:IdentifiedObject.name ?name }

where ?sw is clearly typed — just dynamically. The hint there is pure noise.

SubjectTypeInference.subjectsWithVariableType() collects subjects that carry a
variable type assertion (?s a ?var); SemanticChecks now skips the implied-type
hint for those, while still emitting it for genuinely untyped variables.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…x-only fallback

A misspelt SHACL term (e.g. sh:taaargetClass, sh:dataype) was reported by the
full SHACL validation path but NOT when the schema could not be resolved — e.g.
when a # [endpoint=...] points at an unreachable / non-query endpoint, which
makes the editor fall back to checkShaclSyntaxOnly(). That fallback only did a
Turtle parse + embedded-SPARQL syntax check and returned no shape annotations,
so the typo went silently unreported in VS Code.

The vocabulary-typo check is schema-independent (it consults only the bundled
W3C sh/rdf/rdfs/owl vocabularies), so it can and should run without a schema:

- ShaclShapeAnalyzer.checkVocabularyOnly(graph): new schema-free entry point;
  the underlying vocab check is now static.
- SparqlValidationApi.checkShaclSyntaxOnly: includes those vocab findings in
  shapeAnnotations.
- LSP publishShaclSyntaxOnly and CLI validateSyntaxOnly: render the shape
  annotations (they previously dropped everything but embedded-SPARQL results).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ling

A # [endpoint=...] directive pointing at a Fuseki operation endpoint other than
query could not load a schema. For example, the workshop notebook's SHACL cells
use .../svedala/shacl?graph=cgmes:model:SSH (the server-side SHACL validator),
which is the same URL the execution tooling needs. cimcheck reads the schema via
SPARQL, which /shacl does not serve. Such endpoints now transparently resolve to
the dataset's /query service for schema loading, so one directive works for both
execution and static validation.

Changes to HttpSparqlGraphSource:
- Strip any URL query string (?graph=...) up front — meaningless to the
  ASK/SELECT/CONSTRUCT schema queries and a source of routing confusion.
- Generalize queryEndpointSibling: map the trailing Fuseki operation segment
  update/shacl/data/get/upload -> query (was update-only). query/sparql and a
  bare dataset URL are left as-is.
- Probe the given URL first (so a dataset literally named shacl/update is
  honoured), then fall back to the /query sibling on failure.
- Fix the caught exception type: Jena's QueryExecutionHTTP throws
  org.apache.jena.sparql.engine.http.QueryExceptionHTTP, NOT
  org.apache.jena.atlas.web.HttpException — the old catch never fired, so even
  the original /update fallback was a no-op live (only its string-helper unit
  test passed). The fallback now actually triggers.
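
The URL rewriting in the first two bullets amounts to simple string handling. The sketch below is an illustration under assumptions: the class is hypothetical, the host in the example is invented, and only the segment list (update/shacl/data/get/upload) comes from the commit message. The probe-then-fallback step and the Jena exception handling are not shown.

```java
import java.util.Set;

// Hypothetical string helper; not the HttpSparqlGraphSource implementation.
final class EndpointSibling {
    private static final Set<String> NON_QUERY_OPERATIONS =
            Set.of("update", "shacl", "data", "get", "upload");

    // Drops everything from the first '?' onward, e.g. "?graph=...".
    static String stripQueryString(String url) {
        int question = url.indexOf('?');
        return question < 0 ? url : url.substring(0, question);
    }

    // Rewrites a trailing non-query operation segment to "query";
    // query/sparql and bare dataset URLs are returned unchanged.
    static String queryEndpointSibling(String url) {
        String base = stripQueryString(url);
        int slash = base.lastIndexOf('/');
        if (slash < 0) {
            return base;
        }
        String lastSegment = base.substring(slash + 1);
        return NON_QUERY_OPERATIONS.contains(lastSegment)
                ? base.substring(0, slash + 1) + "query"
                : base;
    }

    public static void main(String[] args) {
        // Host and port are made up for the example.
        System.out.println(queryEndpointSibling(
                "http://localhost:3030/svedala/shacl?graph=cgmes:model:SSH"));
    }
}
```

Because a dataset could itself be named shacl or update, a pure string rewrite like this is ambiguous, which is why the commit probes the given URL first and only then falls back to the sibling.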

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…o syntax-only

When a # [endpoint=...] document's schema can't be resolved (endpoint failed, or
still loading) the editor already validated syntax-only and showed a bottom-right
notification, but the document itself gave no in-line signal that semantic checks
were skipped. Add a first-line WARNING diagnostic ("Schema could not be loaded
from the endpoint — only syntax is checked …") in both the SPARQL
(publishSyntaxOnly) and SHACL (publishShaclSyntaxOnly) fallbacks. It clears
automatically once the schema loads and the document is re-validated.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… schema

Previously only validation honoured a # [endpoint=...] directive; hover,
completion and go-to-definition all read the workspace/bundled schema. So an
endpoint cell validated against Fuseki but showed hover text, completions and
"go to definition" from the bundled CGMES 3.0 — the wrong schema — or nothing
when the term wasn't in the bundled schema at all.

- Hover & completion now resolve the document's schema the same way validation
  does (endpoint schema when declared and resolved, else workspace). The
  endpoint-fetched index already carries rdfs:label/rdfs:comment, so descriptions
  are correct. When an endpoint is declared but not yet resolved, nothing is shown
  rather than wrong-schema info.
- Go-to-definition: for an endpoint term there is no local file, so
  EndpointDefinitionPeek fetches the term's triples from the endpoint
  (HttpSparqlGraphSource.fetchResource), renders them as a cached read-only
  Turtle file, and returns a file:// Location at the term's line. Works for any
  SPARQL endpoint and uniformly across LSP clients (VS Code, IntelliJ). It no
  longer jumps to the misleading bundled file.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
spah-soptim and others added 2 commits June 15, 2026 14:28
The first-line "schema could not be loaded from the endpoint — only syntax is
checked" diagnostic is now ERROR (was WARNING), so an unreachable/failed endpoint
is clearly surfaced rather than blending in with advisory warnings.
…tribution

There is no longer a bundled default schema. Validation gets its schema from an
opencgmes.json ("schemas"/"schemasDirectory") or a "# [endpoint=...]" directive;
with neither, CIMcheck performs a syntax-only check.

Core:
- Delete BundledSchemas (+ test) and CgmesSchemaLoader.bundledDefault().
- Delete the vendored CGMES 3.0 RDFS resources under resources/cgmes/.
- ConfigTemplate: scaffold now documents pointing at your own schemas.

LSP:
- schema.SchemaLoader.loadWithSources returns Optional (empty = no schemas
  configured); drop loadBundledWithSources and the bundled fallback.
- SchemaManager: no config / config-without-schemas → no workspace schema
  (apiRef null) instead of the bundled default; BUNDLED sentinel removed.
- Document service: no resolved schema → syntax-only. The first-line "schema
  could not be loaded" error notice is shown only for the endpoint case; the
  plain no-config case is quietly syntax-only.

CLI:
- schema.SchemaLoader.load(config) returns Optional; drop loadBundled.
- ValidateCommand/ExplainCommand: no --schema/--config/--endpoint (or a config
  without schemas) → syntax-only ("Info: no schema configured…"). InitCommand
  messaging updated.

Docs / IntelliJ: core/vscode/intellij READMEs, plugin.xml and CreateConfigAction
reworded for the no-bundled-default behavior (server-JAR/JBR "bundled" wording
left intact). vscode README's stale "endpoint features = diagnostics only"
limitation corrected (hover/completion/go-to-def are endpoint-aware now).

W3C vocab attribution: the rdf/rdfs/owl/shacl vocabularies under resources/vocab
(used for standard-vocabulary term checking) are W3C works — added vocab/NOTICE.md
with source URLs, © W3C, and the W3C Software and Document License, and referenced
it from the READMEs.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@spah-soptim spah-soptim had a problem deploying to Maven Central Deployment June 15, 2026 13:50 — with GitHub Actions Failure
@spah-soptim spah-soptim had a problem deploying to JetBrains Marketplace Deployment June 15, 2026 13:50 — with GitHub Actions Failure