Description
In a project I want to retrieve all data points of a device, or multiple devices, in one query interaction.
Currently this is possible by creating a query with concatenated OPTIONAL statements that each contain the exact Knowledge Interaction (KI) graph pattern as defined in the knowledge mappers. This is a very large query and has to be maintained if KIs change.
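To illustrate the current workaround, here is a minimal sketch of how such a query could be generated from the KI graph patterns rather than written by hand. The function name `build_optional_query` and the two example patterns are hypothetical; the real patterns would come from the knowledge mappers.

```python
def build_optional_query(variables, ki_patterns):
    """Concatenate one OPTIONAL block per KI graph pattern into a SELECT query."""
    optionals = "\n".join(f"  OPTIONAL {{ {pattern} }}" for pattern in ki_patterns)
    return f"SELECT {' '.join(variables)}\nWHERE {{\n{optionals}\n}}"


# Hypothetical graph patterns, standing in for those defined in the knowledge mappers.
example_patterns = [
    "?device ex:hasTemperature ?temperature .",
    "?device ex:hasHumidity ?humidity .",
]
optional_query = build_optional_query(
    ["?device", "?temperature", "?humidity"], example_patterns
)
print(optional_query)
```

Generating the query does not remove the maintenance burden: the pattern list still has to follow every change to the KIs.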
Using a more concise query, e.g.,
SELECT ?device ?datapoint
WHERE {
?device has:measurement ?datapoint
}
does not work (as it does not match any graph pattern in a known KI in this project), unless you enable a higher reasoner_level.
A higher reasoner_level, however, results in an out-of-memory error caused by a combinatorial explosion as soon as the SPARQL query above needs more triple patterns in its WHERE clause, which limits this solution.
I propose a separate endpoint function + API route, e.g., /explore or /full-sparql-syntax, which:
- queries for all data in a knowledge network (similar to the SPARQL query with OPTIONAL statements, but instead calling all known ASK knowledge interactions, resulting in ?s ?p ?o triples for the graph patterns in these KIs)
- creates an rdflib knowledge graph from all data gathered from the knowledge network
- executes the SPARQL query given in the API call on this knowledge graph.
This 1) prevents the combinatorial explosion in the TKE reasoner, 2) avoids having to write and maintain the large SPARQL query of concatenated OPTIONAL statements, and 3) allows use of the full SPARQL syntax, as far as rdflib supports it (possibly extended later with (plugin) rdflib reasoners, SHACL validation, etc.).
The drawback is that querying for all data in a knowledge network could be prohibitively expensive and inefficient, since unwanted data might also be included, and constructing the full graph in rdflib might be costly in terms of CPU and memory.
A further future addition proposed by @JackJackie is to detect upfront, e.g., based on the graph pattern used in the SPARQL query, whether it is more efficient to use the TKE reasoner or the proposed /explore route.
For example, if more than 3 triples contain an unresolved ?variable, or if more than 3 triples are unknown in the registered knowledge interactions, use the explore route; otherwise, use the reasoner.
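A minimal sketch of such an upfront check is given below. It assumes the triple patterns of the query have already been extracted as tuples of strings, and it treats a triple as "known" when its predicate occurs in a registered KI graph pattern, which is a simplification. The function name `choose_route`, the `threshold` parameter, and the example predicates are all hypothetical.

```python
def choose_route(query_triples, known_predicates, threshold=3):
    """Pick 'explore' or 'reasoner' from simple counts over the query's triples."""
    # Triples that contain at least one ?variable.
    with_variables = sum(
        1 for triple in query_triples
        if any(term.startswith("?") for term in triple)
    )
    # Triples whose predicate does not occur in any registered KI (simplification).
    unknown = sum(1 for triple in query_triples if triple[1] not in known_predicates)
    if with_variables > threshold or unknown > threshold:
        return "explore"
    return "reasoner"


known = {"has:measurement"}
small_query = [("?device", "has:measurement", "?datapoint")]
large_query = [
    ("?device", "has:measurement", "?datapoint"),
    ("?datapoint", "ex:hasValue", "?value"),
    ("?datapoint", "ex:hasUnit", "?unit"),
    ("?datapoint", "ex:hasTimestamp", "?time"),
]
print(choose_route(small_query, known))  # reasoner
print(choose_route(large_query, known))  # explore
```

The threshold of 3 is taken from the example above and would likely need tuning against real queries.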
I am not sure whether the reasoner level can be adapted at runtime for each query, or whether this is necessary. reasoner_level is currently defined when building the Docker images, but perhaps it can also be set at runtime, e.g., when calling the KI REST API.
This proposal is also based on an earlier discussion with @bnouwt.