Binary file modified .DS_Store
Binary file not shown.
34 changes: 34 additions & 0 deletions documentation/README.md
@@ -0,0 +1,34 @@
# SimPaths Documentation

This documentation is structured to support both first-time users and contributors.

## Recommended reading order

1. [Getting Started](getting-started.md)
2. [CLI Reference](cli-reference.md)
3. [Configuration](configuration.md)
4. [Scenario Cookbook](scenario-cookbook.md)
5. [Data and Outputs](data-and-outputs.md)
6. [Troubleshooting](troubleshooting.md)

For contributors and advanced users:

- [Architecture](architecture.md)
- [Development and Testing](development.md)
- [GUI Guide](gui-guide.md)

## Scope

These guides cover:

- Building SimPaths with Maven
- Running single-run and multi-run workflows
- Configuring model, collector, and runtime behavior via YAML
- Understanding expected input/output files and generated artifacts
- Running unit and integration tests locally and in CI

## Conventions

- Commands are shown from the repository root.
- Paths are relative to the repository root.
- `default.yml` refers to `config/default.yml`.
44 changes: 44 additions & 0 deletions documentation/architecture.md
@@ -0,0 +1,44 @@
# Architecture

## High-level module map

Core package layout under `src/main/java/simpaths/`:

- `experiment/`: simulation entry points and orchestration
- `model/`: core simulation entities and yearly process logic
- `data/`: parameters, setup routines, filters, statistics helpers

## Primary entry points

- `simpaths.experiment.SimPathsStart`
  - Builds/refreshes setup artifacts
  - Launches single simulation run (GUI or headless)
- `simpaths.experiment.SimPathsMultiRun`
  - Loads YAML config
  - Iterates runs with optional seed/innovation logic
  - Supports persistence mode switching

## Runtime managers

The simulation engine registers:

- `SimPathsModel`: state evolution and process scheduling
- `SimPathsCollector`: statistics computation and export
- `SimPathsObserver`: GUI observation layer (when GUI is enabled)

## Data flow

1. Setup stage prepares policy schedule and input database.
2. Runtime model loads parameters and input maps.
3. Collector computes and exports statistics at scheduled intervals.
4. Output files are written to run folders under `output/`.

## Configuration flow

`SimPathsMultiRun` combines:

- defaults in class fields
- overrides from `config/<file>.yml`
- final CLI overrides at invocation time

This layered strategy supports reproducible batch runs with targeted command-line changes.
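
The precedence described above can be sketched as a simple map merge. This is only an illustration of the layering idea, not the actual `SimPathsMultiRun` code: the class `LayeredConfigSketch`, its `merge` method, and all values are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of SimPaths.
final class LayeredConfigSketch {

    // Later layers win: class-field defaults < YAML values < CLI overrides.
    static Map<String, Object> merge(Map<String, ?> defaults,
                                     Map<String, ?> yaml,
                                     Map<String, ?> cli) {
        Map<String, Object> merged = new LinkedHashMap<>(defaults);
        merged.putAll(yaml); // YAML replaces matching defaults
        merged.putAll(cli);  // CLI flags replace anything set so far
        return merged;
    }

    public static void main(String[] args) {
        // All values below are made up for the example.
        Map<String, Object> merged = merge(
                Map.of("popSize", 50000, "startYear", 2019),
                Map.of("popSize", 20000, "endYear", 2022),
                Map.of("startYear", 2020));
        System.out.println(merged);
    }
}
```

In this sketch `popSize` ends up with the YAML value and `startYear` with the CLI value, which is the behavior the three layers are meant to give.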
90 changes: 90 additions & 0 deletions documentation/cli-reference.md
@@ -0,0 +1,90 @@
# CLI Reference

## `singlerun.jar` (`SimPathsStart`)

Usage:

```bash
java -jar singlerun.jar [options]
```

### Options

| Option | Meaning |
|---|---|
| `-c`, `--country <CC>` | Country code (`UK` or `IT`) |
| `-s`, `--startYear <year>` | Simulation start year |
| `-Setup` | Setup only (do not run simulation) |
| `-Run` | Run only (skip setup) |
| `-r`, `--rewrite-policy-schedule` | Rebuild policy schedule from policy files |
| `-g`, `--showGui <true/false>` | Enable or disable GUI |
| `-h`, `--help` | Print help |

Notes:

- `-Setup` and `-Run` are mutually exclusive.
- For non-GUI environments, use `-g false`.
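
A minimal sketch of the mutual-exclusion rule, assuming that omitting both flags means "setup, then run" (an assumption, since the table above only describes each flag on its own). The class `ModeFlagsSketch` and its return labels are hypothetical and do not reflect how `SimPathsStart` parses arguments.

```java
import java.util.List;

// Hypothetical helper for illustration; not part of SimPaths.
final class ModeFlagsSketch {

    // Resolves the run mode from raw arguments and rejects -Setup with -Run.
    static String resolveMode(List<String> args) {
        boolean setupOnly = args.contains("-Setup");
        boolean runOnly = args.contains("-Run");
        if (setupOnly && runOnly) {
            throw new IllegalArgumentException("-Setup and -Run are mutually exclusive");
        }
        if (setupOnly) {
            return "setup-only";
        }
        if (runOnly) {
            return "run-only";
        }
        return "setup-and-run"; // assumed behavior when neither flag is given
    }
}
```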

### Examples

Setup only:

```bash
java -jar singlerun.jar -c UK -s 2019 -g false -Setup --rewrite-policy-schedule
```

Run only (after setup exists):

```bash
java -jar singlerun.jar -g false -Run
```

## `multirun.jar` (`SimPathsMultiRun`)

Usage:

```bash
java -jar multirun.jar [options]
```

### Options

| Option | Meaning |
|---|---|
| `-p`, `--popSize <int>` | Simulated population size |
| `-s`, `--startYear <year>` | Start year |
| `-e`, `--endYear <year>` | End year |
| `-DBSetup` | Database setup mode |
| `-n`, `--maxNumberOfRuns <int>` | Number of sequential runs |
| `-r`, `--randomSeed <int>` | Seed for first run |
| `-g`, `--executeWithGui <true/false>` | Enable or disable GUI |
| `-config <file>` | Config file in `config/` (default: `default.yml`) |
| `-f` | Write stdout and logs to `output/logs/` |
| `-P`, `--persist <root\|run\|none>` | Persistence strategy for processed dataset |
| `-h`, `--help` | Print help |

Persistence modes:

- `root` (default): persist to root input area for reuse
- `run`: persist per run output folder
- `none`: no processed-data persistence
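
The three modes and the documented default could be modeled roughly as follows. The enum `PersistModeSketch` is hypothetical, and the case-insensitive matching is an assumption of this sketch rather than confirmed `multirun.jar` behavior.

```java
import java.util.Locale;

// Hypothetical enum for illustration; SimPaths may model this differently.
enum PersistModeSketch {
    ROOT,  // persist to the root input area for reuse
    RUN,   // persist in the run's output folder
    NONE;  // do not persist processed data

    // Falls back to ROOT, the documented default, when no value is given.
    // Case-insensitive matching is an assumption of this sketch.
    static PersistModeSketch parse(String value) {
        if (value == null || value.trim().isEmpty()) {
            return ROOT;
        }
        return valueOf(value.trim().toUpperCase(Locale.ROOT));
    }
}
```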

### Examples

Create setup database using config:

```bash
java -jar multirun.jar -DBSetup -config test_create_database.yml
```

Run two simulations with root persistence:

```bash
java -jar multirun.jar -config test_run.yml -P root
```

Run without persistence and with file logging:

```bash
java -jar multirun.jar -config default.yml -P none -f
```
101 changes: 101 additions & 0 deletions documentation/configuration.md
@@ -0,0 +1,101 @@
# Configuration

SimPaths multi-run behavior is controlled by YAML files in `config/`.

Examples in this repository include:

- `default.yml`
- `test_create_database.yml`
- `test_run.yml`
- `create database.yml`
- `sc analysis*.yml`
- `intertemporal elasticity.yml`
- `labour supply elasticity.yml`

For command-by-command guidance for each provided config, see [Scenario Cookbook](scenario-cookbook.md).

## How config is applied

`SimPathsMultiRun` starts from the defaults in its class fields, then loads `config/<file>` and applies values in two stages:

1. YAML values initialize runtime fields and argument maps.
2. CLI flags override those values if provided.

## Top-level keys

### Core run arguments

Common fields:

- `countryString`
- `maxNumberOfRuns`
- `executeWithGui`
- `randomSeed`
- `startYear`
- `endYear`
- `popSize`
- `integrationTest`

### `model_args`

Passed into `SimPathsModel` via reflection.

Typical toggles include:

- alignment flags (`alignPopulation`, `alignFertility`, `alignEmployment`, ...)
- behavioral switches (`enableIntertemporalOptimisations`, `responsesToHealth`, ...)
- persistence of behavioral grids (`saveBehaviour`, `useSavedBehaviour`, `readGrid`)
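
As a rough illustration of what "passed via reflection" can look like, the sketch below copies map entries onto same-named fields of a target object. `DemoModel` is a stand-in with arbitrary defaults, not `SimPathsModel`, and `ReflectionArgsSketch.apply` is a hypothetical helper; the real loader may resolve names and types differently.

```java
import java.lang.reflect.Field;
import java.util.Map;

// Hypothetical helper for illustration; not part of SimPaths.
final class ReflectionArgsSketch {

    // Stand-in for a model class; field names mirror keys listed above,
    // and the default values are arbitrary.
    static final class DemoModel {
        boolean alignPopulation = true;
        boolean alignFertility = true;
        boolean saveBehaviour = false;
    }

    // Sets each field named by a key in args to the corresponding value.
    static void apply(Object target, Map<String, ?> args) {
        for (Map.Entry<String, ?> entry : args.entrySet()) {
            try {
                Field field = target.getClass().getDeclaredField(entry.getKey());
                field.setAccessible(true);
                field.set(target, entry.getValue());
            } catch (NoSuchFieldException | IllegalAccessException e) {
                throw new IllegalArgumentException(
                        "Unknown or inaccessible key: " + entry.getKey(), e);
            }
        }
    }
}
```

Failing on an unknown key, as this sketch does, makes typos in a YAML file visible early; whether SimPaths rejects or ignores unknown keys is not stated here.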

### `collector_args`

Controls output collection and export behavior (via `SimPathsCollector`), including:

- `persistStatistics`, `persistStatistics2`, `persistStatistics3`
- `persistPersons`, `persistBenefitUnits`, `persistHouseholds`
- `exportToCSV`, `exportToDatabase`

### `innovation_args`

Controls iteration logic across runs, such as:

- `randomSeedInnov`
- `intertemporalElasticityInnov`
- `labourSupplyElasticityInnov`
- `flagDatabaseSetup`

### `parameter_args`

Overrides values from `Parameters` (paths and model-global flags).

Common examples:

- `trainingFlag`
- `working_directory`
- `input_directory`
- `input_directory_initial_populations`
- `euromod_output_directory`

## Minimal example

```yaml
maxNumberOfRuns: 2
executeWithGui: false
randomSeed: 100
startYear: 2019
endYear: 2022
popSize: 20000

collector_args:
  persistStatistics: true
  persistStatistics2: true
  persistStatistics3: true
  persistPersons: false
  persistBenefitUnits: false
  persistHouseholds: false
```

Run a config with these settings (the command below uses `test_run.yml`):

```bash
java -jar multirun.jar -config test_run.yml
```
56 changes: 56 additions & 0 deletions documentation/data-and-outputs.md
@@ -0,0 +1,56 @@
# Data and Outputs

## Data availability model

- Source code and documentation are open.
- Full research input datasets are not freely redistributable.
- Training data is included to support development, local testing, and CI.

## Input directory layout

Key paths:

- `input/`:
  - regression and scenario Excel files (`reg_*.xlsx`, `scenario_*.xlsx`, `align_*.xlsx`)
  - generated setup files (`input.mv.db`, `EUROMODpolicySchedule.xlsx`, `DatabaseCountryYear.xlsx`)
- `input/InitialPopulations/`:
  - `training/population_initial_UK_2019.csv`
  - `compile/` scripts for preparing initial-population inputs
- `input/EUROMODoutput/`:
  - `training/*.txt` policy outputs and schedule artifacts

## Setup-generated artifacts

Running setup mode (`singlerun.jar -Setup` or `multirun.jar -DBSetup`) creates or refreshes:

- `input/input.mv.db`
- `input/EUROMODpolicySchedule.xlsx`
- `input/DatabaseCountryYear.xlsx`

## Output directory layout

Simulation runs produce timestamped folders under `output/`, typically with:

- `csv/` generated statistics and exported entities
- `database/` run-specific persistence output
- `input/` copied or persisted run input artifacts

Common CSV files include:

- `Statistics1.csv`
- `Statistics21.csv`
- `Statistics31.csv`
- `EmploymentStatistics1.csv`
- `HealthStatistics1.csv`

## Logging output

If `-f` is enabled with `multirun.jar`, logs are written to:

- `output/logs/run_<seed>.txt` (stdout capture)
- `output/logs/run_<seed>.log` (log4j output)
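
For scripts that need to locate these files, the naming pattern above can be expressed as a small helper. `LogPathsSketch` is hypothetical, and it assumes paths are resolved relative to the repository root, as stated in the conventions.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration; not part of SimPaths.
final class LogPathsSketch {

    // Path of the stdout capture for the run with the given seed.
    static Path stdoutCapture(long seed) {
        return Paths.get("output", "logs", "run_" + seed + ".txt");
    }

    // Path of the log4j output for the run with the given seed.
    static Path log4jOutput(long seed) {
        return Paths.get("output", "logs", "run_" + seed + ".log");
    }
}
```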

## Validation and analysis assets

- `validation/` contains validation artifacts and graph assets.
- `analysis/` contains `.do` scripts and spreadsheets used for downstream analysis.
61 changes: 61 additions & 0 deletions documentation/development.md
@@ -0,0 +1,61 @@
# Development and Testing

## Build

Compile and package:

```bash
mvn clean package
```

## Tests

### Unit tests

Run unit tests (Surefire):

```bash
mvn test
```

### Integration tests

Run integration tests (Failsafe):

```bash
mvn verify
```

Integration tests exercise setup and run flows and compare generated CSV outputs to expected files in:

- `src/test/java/simpaths/integrationtest/expected/`
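
The comparison idea can be sketched as a line-by-line check that reports the first mismatch. This is not the project's integration test code: `CsvCompareSketch` is hypothetical, and exact string equality per line is an assumption (the real tests may tolerate formatting or numeric differences).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper for illustration; not part of SimPaths.
final class CsvCompareSketch {

    // Returns the 1-based number of the first differing line, or -1 if all lines match.
    static int firstDifference(List<String> expected, List<String> actual) {
        int shared = Math.min(expected.size(), actual.size());
        for (int i = 0; i < shared; i++) {
            if (!expected.get(i).equals(actual.get(i))) {
                return i + 1;
            }
        }
        // Same prefix: the files differ only if one has extra lines.
        return expected.size() == actual.size() ? -1 : shared + 1;
    }

    // Convenience overload that reads both files before comparing them.
    static int firstDifference(Path expected, Path actual) {
        try {
            return firstDifference(Files.readAllLines(expected), Files.readAllLines(actual));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Reporting the line number rather than a plain true/false makes a failing comparison easier to trace back to a specific row of the expected file.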

## CI workflows

GitHub workflows in `.github/workflows/` run:

- build and package on pull requests to `main` and `develop`
- integration tests (`mvn verify`)
- smoke runs for `singlerun.jar` and `multirun.jar` with persistence variants
- Javadoc generation and publish (on `develop` pushes)

## Javadoc

Generate locally:

```bash
mvn javadoc:javadoc
```

## Typical contributor flow

1. Create a feature branch in your fork.
2. Implement and test changes.
3. Run `mvn verify` before opening a PR.
4. Open a PR against `develop` (or `main` for stable fixes, when appropriate).

## Debugging tips

- Use `-g false` on headless systems.
- Use `-f` with `multirun.jar` to capture logs in `output/logs/`.
- Start from `config/test_create_database.yml` and `config/test_run.yml` when reproducing CI behavior.