Note
This repository is under active development. Features, documentation, and structure will change frequently.
A Python module for converting Swiss plant protection product data from CSV format to RDF and publishing it to the LINDAS Linked Data Service.
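As a rough illustration of the CSV-to-RDF idea, the sketch below turns each CSV row into a subject with one triple per non-empty column. It is not the pipeline in this repository: it uses only the standard library and prints N-Triples instead of building a graph with rdflib, and the namespace `https://example.org/plant-protection/` and the columns `id`, `name` and `holder` are made up for the example.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical namespace and column names, chosen only for illustration.
BASE = "https://example.org/plant-protection/"


def escape_literal(value: str) -> str:
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def csv_to_ntriples(csv_text: str) -> list[str]:
    """Emit one triple per non-empty, non-id column of every CSV row."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # The "id" column identifies the subject; percent-encode it for the IRI.
        subject = f"<{BASE}product/{quote(row['id'], safe='')}>"
        for column, value in row.items():
            if column == "id" or not value:
                continue
            predicate = f"<{BASE}{quote(column, safe='')}>"
            triples.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return triples


sample = "id,name,holder\n42,Example Product,Example AG\n43,Other Product,\n"
for line in csv_to_ntriples(sample):
    print(line)
```

Empty cells are skipped rather than written as empty literals, so the second sample row yields only a `name` triple.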
- Set up the virtual environment
  If uv is not yet installed:

      curl -LsSf https://astral.sh/uv/install.sh | sh
      uv venv psmv-rdf
      source psmv-rdf/bin/activate

- Install the package in editable mode

      uv pip install -e .

- Start the data integration pipeline

      python -m service.pipeline

- To upload the graph, first, place a .env file in the directory root:

      LINDAS_USER=********
      LINDAS_PASSWORD=************
      ENDPOINT=https://graphdb.lindas.admin.ch/repositories/lindas/rdf-graphs/service
      GRAPH=https://lindas.admin.ch/fsvo/plant-protection-products

  Then trigger the upload to LINDAS:

      python -m service.upload_graph
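To show how the four .env values could fit together, here is a minimal sketch that reads KEY=VALUE lines and builds an HTTP request following the SPARQL 1.1 Graph Store protocol, without sending it. It is not the code behind the upload module. The helpers `parse_env` and `build_upload_request` are hypothetical, the credentials are placeholders, and both HTTP Basic authentication and the choice of `PUT` (which replaces the named graph, whereas `POST` would merge into it) are assumptions.

```python
import base64
import urllib.parse
import urllib.request


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        config[key.strip()] = value.strip()
    return config


def build_upload_request(config: dict[str, str], turtle: bytes) -> urllib.request.Request:
    """Build (but do not send) a Graph Store PUT request for the named graph."""
    # The target graph is passed as the ?graph= query parameter.
    query = urllib.parse.urlencode({"graph": config["GRAPH"]})
    url = f"{config['ENDPOINT']}?{query}"
    # Assumption: the endpoint accepts HTTP Basic authentication.
    credentials = f"{config['LINDAS_USER']}:{config['LINDAS_PASSWORD']}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(
        url,
        data=turtle,
        method="PUT",
        headers={
            "Content-Type": "text/turtle",
            "Authorization": f"Basic {token}",
        },
    )


# Placeholder credentials; real values belong in the .env file only.
env_text = """
LINDAS_USER=alice
LINDAS_PASSWORD=secret
ENDPOINT=https://graphdb.lindas.admin.ch/repositories/lindas/rdf-graphs/service
GRAPH=https://lindas.admin.ch/fsvo/plant-protection-products
"""
body = b'<https://example.org/s> <https://example.org/p> "o" .\n'
request = build_upload_request(parse_env(env_text), body)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would perform the upload. Building the request separately makes it possible to inspect the URL and headers first.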
psmv-rdf/
├── .github/
├── README.md
├── data/                # any non-RDF data files, used as input data
│   ├── raw/             # input CSV files
│   └── mapping/         # yaml mapping files
├── services/
│   └── pipeline.py
├── src/
│   ├── sparql/          # SPARQL queries and inference rules
│   └── python/          # Python scripts for specific tasks
├── rdf/
│   ├── ontology/        # OWL ontology documentation
│   ├── shapes/          # SHACL shapes, also used as data model documentation
│   ├── data/            # the actual RDF data, split by classes
│   ├── example/         # example turtle files used for reference
│   └── processed/       # any automatically written/derived/merged turtle files
├── tests/
├── docs/                # project documentation
├── LICENSE
└── .gitignore

All ontology documentation files are written to rdf/ontology.
You may inspect a visual representation of the ontology used here.
A more restricted data model is written in SHACL and can be inspected here.
Project dependencies are listed in pyproject.toml.
- Built with rdflib
- Integrates with LINDAS, the Swiss federal linked data service.
- Original ontology and pipeline by Damian Oswald with plant protection pipeline