
Querying for non-KI graph patterns without reasoner #24

@kadevgraaf-tno

In a project I want to retrieve all data points of a device, or of multiple devices, in one query interaction.
Currently this is possible by creating a query with concatenated OPTIONAL statements that each contain the exact Knowledge Interaction (KI) graph pattern as defined in the knowledge mappers. The resulting query is very large and has to be updated whenever a KI changes.

A more concise query would be, e.g.:

SELECT ?device ?datapoint 
WHERE {
?device has:measurement ?datapoint
}

This query does not work (it does not match the graph pattern of any KI known in this project) unless you enable a higher reasoner_level.
A higher reasoner_level, however, results in an out-of-memory error caused by a combinatorial explosion once the SPARQL query above needs more triples in its WHERE clause, which limits this solution.

I propose a separate endpoint function and API route, e.g., /explore or /full-sparql-syntax, which:

  • queries for all data in the knowledge network (similar to the SPARQL query with OPTIONAL statements, but by calling all known ASK knowledge interactions instead, which results in ?s ?p ?o triples for the graph patterns in these KIs)
  • creates an rdflib knowledge graph from all data gathered in the knowledge network
  • executes the SPARQL query given in the API call on this knowledge graph.

This 1) prevents the combinatorial explosion in the TKE reasoner, 2) avoids using and maintaining the large SPARQL query of concatenated OPTIONAL statements, and 3) allows use of the full SPARQL syntax as far as rdflib supports it (possibly extended later with rdflib reasoner plugins, SHACL validation, etc.).

The drawback is that querying for all data in a knowledge network could be prohibitively expensive and inefficient, as unwanted data might also be included, and constructing the full graph in rdflib might be costly in terms of CPU and memory.

A possible future addition, proposed by @JackJackie, is to detect upfront, e.g., based on the graph pattern used in the SPARQL query, whether it is more efficient to use the TKE reasoner or the proposed /explore route.
For example, if more than 3 triples contain an unresolved ?variable, or if more than 3 triples are unknown in the registered knowledge interactions, then use the explore route. Otherwise, if there are fewer than 3, use the reasoner.
I am not sure whether the reasoner level can be changed at runtime for each query, or whether this is necessary. reasoner_level is currently defined when building the Docker images, but perhaps it can also be set at runtime, e.g., when calling the KI REST API.
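
As a rough illustration of the threshold heuristic above, the route selection might look like the sketch below. Everything here is an assumption made for the example: the names `choose_route`, `count_variable_triples` and `count_unknown_triples` are hypothetical, triples are simplified to tuples of strings, a triple counts as "known" only if it appears verbatim in the set of registered KI triples, and the case of exactly 3 (which the proposal leaves open) is sent to the reasoner.

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]

# Threshold taken from the example in the proposal.
THRESHOLD = 3


def count_variable_triples(pattern: Iterable[Triple]) -> int:
    """Count triples with at least one ?variable term
    (assumption: any term starting with '?' counts as unresolved)."""
    return sum(1 for t in pattern if any(term.startswith("?") for term in t))


def count_unknown_triples(pattern: Iterable[Triple], known: Set[Triple]) -> int:
    """Count triples that do not appear verbatim in any registered KI
    (assumption: exact match, no variable renaming)."""
    return sum(1 for t in pattern if t not in known)


def choose_route(pattern, known, threshold: int = THRESHOLD) -> str:
    """Return 'explore' when either count exceeds the threshold, else 'reasoner'.
    Assumption: a count of exactly `threshold` falls back to the reasoner."""
    pattern = list(pattern)
    if (count_variable_triples(pattern) > threshold
            or count_unknown_triples(pattern, known) > threshold):
        return "explore"
    return "reasoner"


known_ki_triples = {("?device", "has:measurement", "?datapoint")}

small_query = [("?device", "has:measurement", "?datapoint")]
large_query = [
    ("?device", "has:measurement", "?datapoint"),
    ("?datapoint", "has:value", "?value"),
    ("?datapoint", "has:unit", "?unit"),
    ("?datapoint", "has:timestamp", "?time"),
]

print(choose_route(small_query, known_ki_triples))
print(choose_route(large_query, known_ki_triples))
```

A real implementation would have to parse the SPARQL query and match triples against KI graph patterns up to variable renaming. The sketch only shows where the threshold comparison would sit.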

Also based on earlier discussion with @bnouwt
