PSMV-RDF

Note

This repository is under active development. Features, documentation, and structure will change frequently.

Plant Protection Products (PSMV) as Linked Data

A Python module for converting Swiss plant protection product data from CSV format to RDF and publishing it to the LINDAS Linked Data Service.
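As a rough illustration of what converting CSV to RDF involves, the sketch below turns each CSV row into N-Triples using only the standard library. It is not the module's actual implementation: the namespace `https://example.org/psmv/`, the column names (`id`, `name`, `holder`), and the function names are assumptions made up for this example, and the real pipeline is driven by the YAML mapping files.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical namespace, for illustration only (not the project's real IRIs).
BASE = "https://example.org/psmv/"


def escape_literal(value: str) -> str:
    """Escape a string for use inside an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def csv_to_ntriples(csv_text: str, id_column: str = "id") -> list[str]:
    """Emit one triple per non-empty column of every CSV row."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{BASE}product/{quote(row[id_column], safe='')}>"
        for column, value in row.items():
            # Skip the identifier itself, surplus cells, and empty values.
            if column is None or column == id_column or not value:
                continue
            predicate = f"<{BASE}{quote(column, safe='')}>"
            triples.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return triples


# Made-up sample row with assumed column names.
sample = "id,name,holder\n1,Example Product,Example AG\n"
for line in csv_to_ntriples(sample):
    print(line)
```

A real conversion would map columns to ontology terms and typed literals rather than deriving predicates from the column headers.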

Reproduce the data integration pipeline

  1. Set up the virtual environment

    If uv is not yet installed: curl -LsSf https://astral.sh/uv/install.sh | sh

    uv venv psmv-rdf
    source psmv-rdf/bin/activate  
  2. Install the package in editable mode

    uv pip install -e .
  3. Start the data integration pipeline

    python -m service.pipeline
  4. To upload the graph, first place a .env file in the repository root:

    LINDAS_USER=********
    LINDAS_PASSWORD=************
    ENDPOINT=https://graphdb.lindas.admin.ch/repositories/lindas/rdf-graphs/service
    GRAPH=https://lindas.admin.ch/fsvo/plant-protection-products

    Then trigger the upload to LINDAS:

    python -m service.upload_graph
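For orientation, the upload step can be pictured as an HTTP request to the graph store endpoint configured in .env. The sketch below is an assumption about how such an upload might look, not the code in `service.upload_graph`: it assumes the endpoint accepts a SPARQL Graph Store Protocol `PUT` with a `graph` query parameter and HTTP Basic authentication, and the helper names `parse_env` and `build_upload_request` are hypothetical.

```python
import base64
import urllib.parse
import urllib.request


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def build_upload_request(settings: dict[str, str], turtle: bytes) -> urllib.request.Request:
    """Build a PUT request that would replace the named graph with `turtle`."""
    query = urllib.parse.urlencode({"graph": settings["GRAPH"]})
    credentials = f"{settings['LINDAS_USER']}:{settings['LINDAS_PASSWORD']}"
    token = base64.b64encode(credentials.encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{settings['ENDPOINT']}?{query}",
        data=turtle,
        method="PUT",  # assumption: PUT replaces the graph, POST would merge into it
        headers={
            "Content-Type": "text/turtle",
            "Authorization": f"Basic {token}",  # assumption: Basic auth
        },
    )


# Usage (not executed here; the Turtle file path is a placeholder):
# with open(".env", encoding="utf-8") as f:
#     settings = parse_env(f.read())
# with open("path/to/graph.ttl", "rb") as f:
#     request = build_upload_request(settings, f.read())
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

Building the request separately from sending it keeps the credentials handling and URL construction easy to inspect before anything is written to LINDAS.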

Project Structure

psmv-rdf/
├── .github/
├── README.md
├── data/           # any non-RDF data files, used as input data
│   ├── raw/        # input CSV files
│   └── mapping/    # yaml mapping files
├── service/
│   └── pipeline.py
├── src/
│   ├── sparql/     # SPARQL queries and inference rules
│   └── python/     # Python scripts for specific tasks
├── rdf/
│   ├── ontology/   # OWL ontology documentation
│   ├── shapes/     # SHACL shapes, also used as data model documentation
│   ├── data/       # the actual RDF data, split by classes
│   ├── example/    # example turtle files used for reference
│   └── processed/  # any automatically written/derived/merged turtle files
├── tests/
├── docs/           # project documentation
├── LICENSE
└── .gitignore

Documentation

All ontology documentation files are written to rdf/ontology. You may inspect a visual representation of the ontology used here.

A more restricted data model is written in SHACL and can be inspected here.

Project dependencies are listed in pyproject.toml.

Acknowledgments