Home
Emmanuel S. edited this page Apr 15, 2026
·
20 revisions
Goal: This project should be replaced by Geoplateforme API calls providing a keyword-based search engine for collections (like https://www.data.gouv.fr/api/1/datasets/?q=ecole&page=1&page_size=20) and detailed schemas for collections (like Table Schema).
- Scrape GetCapabilities and DescribeFeatureType from https://data.geopf.fr/wfs
- Allow overwrites and completions (src/data/BDTOPO_V3/batiment.(json|yaml) with bdtopoexplorer.ign.fr - batiment source data)
- Ensure that it improves search at MCP level (see experiment with MiniSearch on a branch)
- #4 - Improve data management to ease change detection and overwrite updates (create unique file for each WFS FeatureType)
- #5 - Review available data on data.geopf.fr and improve filtering to keep only relevant ones (remove gpf publication test datasets, local data,...)
- #6 - Integrate the lightweight search engine (search(q: string)) based on MiniSearch from the MCP (ignfab/geocontext)
- #8 - Improve logging to avoid problems in the MCP
- #6 Add functional tests for the search :
- query: "bâtiment"
expected: ["BDTOPO_V3:batiment","BDCARTO_V5:batiment"]
...
Use more WFS information:
- #6 Define a first working search strategy that matches expectations (weighting between datasets, ...)
- Retrieve keywords from DescribeFeatureType
- Parse namespace to extract version ("ADMINEXPRESS-COG.2026" -> {"version": "2026"})
- #16 Gather more internal metadata. Revisit the first naive metadata extraction.
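A minimal sketch of the namespace parsing idea, not the project's actual code. It assumes the version is a trailing ".<4-digit year>" suffix, as in "ADMINEXPRESS-COG.2026"; the `parseNamespace` name and the `product` field are hypothetical. Other versioning styles (for example the "_V3" in "BDTOPO_V3") are deliberately left untouched here.

```typescript
// Hypothetical result shape: only "version" comes from the wiki example,
// "product" is an assumption added for illustration.
interface ParsedNamespace {
  product: string;
  version?: string;
}

// Split a WFS namespace into a product name and an optional version.
// Assumption: the version is a trailing ".YYYY" suffix.
function parseNamespace(namespace: string): ParsedNamespace {
  const match = /^(.+)\.(\d{4})$/.exec(namespace);
  if (match === null) {
    // No recognizable suffix: keep the namespace as-is, without a version.
    return { product: namespace };
  }
  return { product: match[1], version: match[2] };
}
```

With this rule, `parseNamespace("ADMINEXPRESS-COG.2026")` yields a `version` of "2026", while `parseNamespace("BDTOPO_V3")` returns no version at all.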
Integrate validation schemas :
- #14 for https://www.geoportail-urbanisme.gouv.fr/ ( wfs_sup / wfs_du / wfs_scot )
- #17 Retrieve relevant information from ISO 19115 metadata from https://data.geopf.fr/csw
⚠️ Table schemas are not included in this metadata ⚠️
=> ISO 19115 is not trivial!
- See IGNF/validator - doc/metadata.md for the model
- See MetadataURL in GetCapabilities for the links :
curl -sS "https://data.geopf.fr/wfs?SERVICE=WFS&VERSION=2.0.0&REQUEST=GetCapabilities" | xmllint --format - | grep MetadataURL
Use abstractions to prepare replacement?
- SearchEngine <- MinisearchSearchEngine / GpfSearchEngine (client for https://data.geopf.fr/recherche/api/indexes/geoplateforme)
- CollectionStore <- LocalCollectionStore (with overwrites) / GpfApiCollectionStore (client for the OGC API Feature)
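The two abstractions could look roughly like the sketch below. Only the `SearchEngine` and `CollectionStore` names and `search(q: string)` come from this page. The `Collection` fields, `getById`, and the two in-memory classes are assumptions for illustration. The search here is a naive accent-insensitive substring match standing in for MiniSearch, with no ranking.

```typescript
// Hypothetical collection shape (field names are assumptions).
interface Collection {
  id: string;        // e.g. "BDTOPO_V3:batiment"
  namespace: string; // e.g. "BDTOPO_V3"
  name: string;      // e.g. "batiment"
  title: string;
}

// Returns matching collection ids. Async so that a remote client
// (GpfSearchEngine) could implement the same contract.
interface SearchEngine {
  search(q: string): Promise<string[]>;
}

interface CollectionStore {
  getById(id: string): Promise<Collection | undefined>;
}

// Strip diacritics and lowercase, so that "bâtiment" matches "batiment".
function normalize(text: string): string {
  return text.normalize("NFD").replace(/[\u0300-\u036f]/g, "").toLowerCase();
}

// Stand-in for LocalCollectionStore: everything is held in memory.
class InMemoryCollectionStore implements CollectionStore {
  private readonly byId = new Map<string, Collection>();

  constructor(collections: Collection[]) {
    for (const collection of collections) {
      this.byId.set(collection.id, collection);
    }
  }

  async getById(id: string): Promise<Collection | undefined> {
    return this.byId.get(id);
  }
}

// Stand-in for MinisearchSearchEngine: substring match, input order kept.
class SubstringSearchEngine implements SearchEngine {
  private readonly collections: Collection[];

  constructor(collections: Collection[]) {
    this.collections = collections;
  }

  async search(q: string): Promise<string[]> {
    const needle = normalize(q);
    return this.collections
      .filter((c) => normalize(`${c.name} ${c.title}`).includes(needle))
      .map((c) => c.id);
  }
}
```

Callers that depend only on the two interfaces would not need to change when a Géoplateforme-backed implementation replaces the local one.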
- Use an existing metamodel (Table Schema or IGN Validator) instead of src to align with validation requirements (not required for now as an LLM doesn't parse data and doesn't care about model changes)
- Illustrate the expected service at Géoplateforme level with a lightweight REST API :
  - Get all collections (/api/collections) - too fat for an LLM (seen on GeoServer implementation)
  - Get collections by id (/api/collections/{id}) - required to allow the MCP to query features
  - Search collection (/api/collections/search?q={text}) - required to allow the MCP to find data
  - Get collections by namespace (aka serie) (/api/collections?namespace=BDTOPO_V3) - not required for MCP
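To make the four routes above concrete, here is a framework-free sketch of how a request path could be mapped to an operation. The `Route` type and the `matchRoute` and `parseQuery` helpers are hypothetical. Only the URL patterns come from this page. Note that the search route has to be checked before the by-id route, otherwise "search" would be read as a collection id.

```typescript
// Hypothetical description of the operation targeted by a request path.
type Route =
  | { kind: "list"; namespace?: string }
  | { kind: "search"; q: string }
  | { kind: "byId"; id: string }
  | { kind: "notFound" };

// Minimal query-string parser (last value wins for repeated keys).
function parseQuery(query: string): Map<string, string> {
  const params = new Map<string, string>();
  for (const pair of query.split("&")) {
    if (pair === "") continue;
    const eq = pair.indexOf("=");
    const key = eq === -1 ? pair : pair.slice(0, eq);
    const value = eq === -1 ? "" : pair.slice(eq + 1);
    params.set(decodeURIComponent(key), decodeURIComponent(value.replace(/\+/g, " ")));
  }
  return params;
}

function matchRoute(path: string): Route {
  const queryStart = path.indexOf("?");
  const pathname = queryStart === -1 ? path : path.slice(0, queryStart);
  const params = parseQuery(queryStart === -1 ? "" : path.slice(queryStart + 1));

  if (pathname === "/api/collections") {
    // With or without the optional namespace filter.
    const namespace = params.get("namespace");
    return namespace === undefined ? { kind: "list" } : { kind: "list", namespace };
  }
  if (pathname === "/api/collections/search") {
    return { kind: "search", q: params.get("q") ?? "" };
  }
  const prefix = "/api/collections/";
  if (pathname.startsWith(prefix) && pathname.length > prefix.length) {
    const id = pathname.slice(prefix.length);
    if (!id.includes("/")) {
      return { kind: "byId", id: decodeURIComponent(id) };
    }
  }
  return { kind: "notFound" };
}
```

For example, `matchRoute("/api/collections?namespace=BDTOPO_V3")` resolves to the list operation filtered on the BDTOPO_V3 namespace.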
- Input : data/wfs/{namespace}/{name}.json + document (HTML/PDF)
- Output : data/overwrite/{namespace}/{name}.json (that can be reviewed / completed)
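The input and output locations above can be derived from a collection identifier. The sketch below assumes identifiers follow the "namespace:name" form seen in "BDTOPO_V3:batiment"; the `collectionPaths` helper is hypothetical.

```typescript
// Hypothetical helper: build the input and output file paths for a collection.
// Assumption: the identifier has exactly one ":" separating namespace and name.
function collectionPaths(typeName: string): { input: string; output: string } {
  const parts = typeName.split(":");
  if (parts.length !== 2 || parts[0] === "" || parts[1] === "") {
    throw new Error(`Expected "namespace:name", got "${typeName}"`);
  }
  const [namespace, name] = parts;
  return {
    input: `data/wfs/${namespace}/${name}.json`,
    output: `data/overwrite/${namespace}/${name}.json`,
  };
}
```

Keeping one file per FeatureType on both sides means an overwrite can be reviewed next to the scraped file it completes.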
- Display collection grouping by product (personal experiment is available here : https://www.quadtreeworld.net/geekeries/wfs-explorer/)
- Allow users to search collections with a form (as having to use an LLM-based tool to find available data is not very eco-friendly...)