Getting Started

Ksonar262 edited this page Apr 15, 2026 · 1 revision

This page walks you through cloning the template, setting up your Python environment, and verifying that everything is working.


Prerequisites

Before you begin, make sure you have the following installed:

| Tool | Version | Notes |
|---|---|---|
| Python | 3.10+ | The template virtual environment uses Python 3.10 |
| Git | Any recent | For cloning and version control |
| Docker | 20.10+ | Required for containerised training and the docs builder |
| Make | Any | Optional, used by the documentation build system |

You don't need to install Docker to read or edit the documentation. Python and Git are the minimum requirements.
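
If you want to check the prerequisites before going further, a small script along these lines can help. It is only a sketch: it confirms the Python version and looks for `git`, `docker` and `make` on your `PATH`, but it does not check the Docker version. The function name and status strings are illustrative and not part of the template.

```python
import shutil
import sys

# Minimum Python version from the prerequisites table.
MIN_PYTHON = (3, 10)

# Tools from the table; only git is treated as required in this sketch.
TOOLS = {"git": True, "docker": False, "make": False}


def check_prerequisites(version_info=sys.version_info, which=shutil.which):
    """Return (name, status) pairs for Python and each command-line tool."""
    results = []
    python_ok = tuple(version_info[:2]) >= MIN_PYTHON
    results.append(("python", "ok" if python_ok else "too old"))
    for tool, required in TOOLS.items():
        if which(tool) is not None:
            status = "ok"
        else:
            status = "missing" if required else "missing (optional)"
        results.append((tool, status))
    return results


if __name__ == "__main__":
    for name, status in check_prerequisites():
        print(f"{name:<8} {status}")
```

Anything reported as `missing (optional)` only matters if you plan to use containerised training or the documentation builder.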


1. Clone the Repository

git clone https://github.com/GSTT-CSC/project-template.git
cd project-template

Clone the whole repository, then delete any task subfolders your project does not need.

2. Create and Activate a Virtual Environment

The template uses a Python virtual environment to isolate dependencies. Run these commands from the repository root:

python3.10 -m venv <your-env-name>        # on Windows, typically: py -3.10 -m venv <your-env-name>
source <your-env-name>/bin/activate       # macOS / Linux
# OR
<your-env-name>\Scripts\Activate.ps1      # Windows PowerShell
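
To confirm that activation worked, you can ask the interpreter itself. The sketch below relies on standard Python behaviour: inside a venv, `sys.prefix` differs from `sys.base_prefix`. The helper name is illustrative only.

```python
import sys


def in_virtual_environment(prefix=sys.prefix, base_prefix=sys.base_prefix):
    """Return True when the running interpreter belongs to a venv.

    Inside a venv, sys.prefix points at the environment directory while
    sys.base_prefix still points at the base Python installation.
    """
    return prefix != base_prefix


if __name__ == "__main__":
    if in_virtual_environment():
        print(f"Virtual environment active: {sys.prefix}")
    else:
        print("No virtual environment active; run the activate step first.")
```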

3. Install Dependencies

Navigate into the task you want to work on (e.g. Classifier_2D) and install its requirements:

cd Project/Classifier_2D
pip install -r requirements.txt

Shared dependencies used across tasks are kept minimal; each task folder manages its own requirements.txt.
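
Because each task carries its own requirements file, it can be handy to list which task folders are present before deciding what to install. The following is a minimal sketch that assumes the layout implied above (task folders directly under `Project/`, each with a `requirements.txt`). The function name is hypothetical.

```python
from pathlib import Path


def find_task_requirements(project_dir):
    """Map each task folder name to its requirements.txt path, if present."""
    project = Path(project_dir)
    if not project.is_dir():
        return {}
    return {
        child.name: child / "requirements.txt"
        for child in sorted(project.iterdir())
        if child.is_dir() and (child / "requirements.txt").is_file()
    }


if __name__ == "__main__":
    # Run from the repository root; "Project" is the folder used in the step above.
    for task, req in find_task_requirements("Project").items():
        print(f"{task}: pip install -r {req}")
```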


4. Configure the Project

Each task has a config/ folder containing .cfg configuration files. These control training hyperparameters, data paths, MLflow settings, and more. Open the relevant config file and update the paths for your environment:

# Example: Project/Classifier_2D/config/
├── train.cfg       # Training hyperparameters and data paths
└── tune.cfg        # Optuna hyperparameter search settings

See MLOps & Training Workflow for a full explanation of configuration options.
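
As a rough illustration of working with these files, the sketch below parses INI-style text with Python's standard `configparser` and reports which path entries still point at locations that do not exist on your machine. It assumes the `.cfg` files are INI-style. The section and key names (`data`, `data_dir`, `training`, `learning_rate`, `max_epochs`) are made up for the example, so check your own `train.cfg` for the real ones.

```python
import configparser
from pathlib import Path

# Hypothetical config text; the real train.cfg sections and keys may differ.
EXAMPLE_CFG = """
[data]
data_dir = /path/to/dataset

[training]
learning_rate = 0.001
max_epochs = 50
"""


def load_config(text):
    """Parse INI-style text into a ConfigParser."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser


def missing_paths(parser, section="data"):
    """Return the keys in `section` whose values are paths that do not exist."""
    if not parser.has_section(section):
        return []
    return [key for key, value in parser[section].items() if not Path(value).exists()]


if __name__ == "__main__":
    cfg = load_config(EXAMPLE_CFG)
    print("learning rate:", cfg.getfloat("training", "learning_rate"))
    print("paths to update:", missing_paths(cfg))
```

Any key listed under "paths to update" is a candidate for editing before you start training.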


5. Verify the Setup

From the repository root, run the test suite to confirm everything is installed correctly:

cd Project
python -m pytest tests/

A clean output with all tests passing confirms your environment is ready.


6. (Optional) Build the Documentation

The documentation/ folder uses a Dockerised Pandoc pipeline to render Markdown into PDF and DOCX compliance documents:

cd documentation
docker-compose up

Output documents are written to documentation/documents/. See Documentation & Regulatory Files for more detail.


Connecting to the DGX GPU Server

To run training jobs on the shared DGX server, connect via SSH; on Windows, use PuTTY.

MLflow tracking is configured to point at the remote MLflow server. Update the mlflow_tracking_uri in your train.cfg before running jobs remotely.

The MLflow and XNAT servers can run either locally (started with docker-compose) or on the DGX.
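
Switching between a local and a remote MLflow server comes down to changing `mlflow_tracking_uri`. If you prefer to script that change, something like the sketch below could work, assuming `train.cfg` is INI-style. The `mlflow` section name and both example URIs are placeholders, not values taken from the template. Note also that `configparser` drops comments when it rewrites a file, so editing by hand may be preferable for heavily commented configs.

```python
import configparser
import tempfile
from pathlib import Path


def set_tracking_uri(cfg_path, uri, section="mlflow"):
    """Set mlflow_tracking_uri in an INI-style config file and save it.

    The section name is an assumption; adjust it to match your train.cfg.
    """
    parser = configparser.ConfigParser()
    parser.read(cfg_path)
    if not parser.has_section(section):
        parser.add_section(section)
    parser[section]["mlflow_tracking_uri"] = uri
    with open(cfg_path, "w") as handle:
        parser.write(handle)


if __name__ == "__main__":
    # Demonstrate on a throwaway file rather than a real train.cfg.
    with tempfile.TemporaryDirectory() as tmp:
        cfg = Path(tmp) / "train.cfg"
        cfg.write_text("[mlflow]\nmlflow_tracking_uri = http://localhost:5000\n")
        set_tracking_uri(cfg, "http://dgx.example.org:5000")  # placeholder host
        print(cfg.read_text())
```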


Next: Directory Structure