Michael Watson edited this page Mar 26, 2026 · 5 revisions
 █████╗  ██████╗████████╗██╗   ██╗ █████╗ ███████╗██╗      ██████╗ ██╗    ██╗
██╔══██╗██╔════╝╚══██╔══╝██║   ██║██╔══██╗██╔════╝██║     ██╔═══██╗██║    ██║
███████║██║        ██║   ██║   ██║███████║█████╗  ██║     ██║   ██║██║ █╗ ██║
██╔══██║██║        ██║   ██║   ██║██╔══██║██╔══╝  ██║     ██║   ██║██║███╗██║
██║  ██║╚██████╗   ██║   ╚██████╔╝██║  ██║██║     ███████╗╚██████╔╝╚███╔███╔╝
╚═╝  ╚═╝ ╚═════╝   ╚═╝    ╚═════╝ ╚═╝  ╚═╝╚═╝     ╚══════╝ ╚═════╝  ╚══╝╚══╝

Non-life insurance pricing in Python — from raw data to deployed model.


What is ActuaFlow?

ActuaFlow is a Python package for non-life insurance pricing. The long-term goal is to cover the complete pricing workflow: loading and cleaning data, fitting frequency and severity GLMs, applying credibility weighting, building a tariff, optimising rates, producing governance documentation, and shipping a production-ready scoring bundle — all from one coherent API.

Most actuarial Python work today happens across scattered notebooks, bespoke scripts, and Excel spreadsheets stitched together with manual steps. ActuaFlow aims to replace that fragmentation with a package that is opinionated about the correct sequence of steps, enforces good practices automatically, and fits into an insurer or MGA's existing data pipeline rather than requiring one built around it.

The package is in early development. Only a subset of the intended functionality exists. This wiki is explicit throughout about what is implemented versus what is planned.


Why we built it

The immediate motivation was practical frustration. Building a pricing model in Python requires too much from-scratch work that every pricing team duplicates independently: implementing IRLS, rewriting credibility weighting for each project, building the lift curves and A/E plots that every actuary needs but that no single library provides, manually converting model coefficients into tariff relativities, and deploying by emailing a notebook to IT and waiting.

The longer-term motivation is structural. Regulators are increasingly expecting documented, auditable, reproducible pricing model development. Insurers and MGAs that treat pricing as a software engineering problem — with version control, automated testing, immutable audit trails, and proper CI/CD — will have a material advantage. ActuaFlow is the infrastructure that makes that achievable without each team building their own.


Current state

March 2026 — Phase 0 is complete. Phase 1 is roughly 40% done.

Phase    What it covers                                    Status
Phase 0  CI/CD, config, model save/load/compare            ✅ Complete
Phase 1  Core GLMs, credibility, ZI models, diagnostics    🔄 In progress (5 of 13 items done)
Phase 2  Tariff builder, rate optimiser, GAM, compliance   ⏳ Not started
Phase 3  REST API, plugins, ONNX, Docker, AutoML           ⏳ Not started
Phase 4  Monitoring, audit log, governance, fairness       ⏳ Not started
Phase 5  Rust engine, IBNR, spatial GLM, .aflow bundles    ⏳ Not started
Phase 6  7 actuarial visualisation charts                  ⏳ Not started

Queued — not yet scheduled

  • PricingSession — an 8-step guided workflow object that calls everything in the correct sequence, auto-runs diagnostics at each step, and tells the actuary what to do next
  • AutoDocumentationGenerator — produces Word, PDF, HTML, or Markdown model documentation automatically from a fitted session; audience parameter selects management summary, regulator filing pack, or peer review pack
  • WhatIfSimulator — interactive ipywidgets dashboard with sliders for each rating factor, showing real-time premium impact, volume change, and projected loss ratio

Roadmap at a glance

Phase 0  ████████████████████  DONE          3 items
Phase 1  ████████░░░░░░░░░░░░  IN PROGRESS   5/13 items done
Phase 2  ░░░░░░░░░░░░░░░░░░░░  NEXT          10 items
Phase 3  ░░░░░░░░░░░░░░░░░░░░              11 items
Phase 4  ░░░░░░░░░░░░░░░░░░░░              12 items
Phase 5  ░░░░░░░░░░░░░░░░░░░░              13 items
Phase 6  ░░░░░░░░░░░░░░░░░░░░               7 items
         + 3 queued items

Design philosophy

Actuaries first. The API uses actuarial terminology throughout — frequency, severity, relativity, credibility, earned exposure, loss ratio, IBNR. The diagnostic plots are the ones that go into a model review pack, not a scikit-learn tutorial. Methods default to approaches that are defensible to regulators.

Honest about uncertainty. A thin territory should not get a precise estimate it has not earned. Gaussian Process spatial smoothing (Phase 5) produces a posterior uncertainty per location: policies in dense areas get precise estimates, while policies in sparse areas are automatically credibility-weighted toward the mean. The width of each location's posterior interval is the model's credibility score.

Production means production. A package that cannot be deployed is not a production tool. The audit log, governance lifecycle, fairness circuit breaker, and .aflow sidecar are first-class features because pricing models in regulated markets require all of them.

No hidden behaviour. When the Rust engine activates on a large dataset it produces identical outputs to the Python engine. Both paths are tested against each other in CI.


Contributing

git clone https://github.com/your-org/actuaflow
cd actuaflow
pip install -e ".[dev]"
pre-commit install
pytest

When contributing, add type hints and a docstring, cite the actuarial source for any statistical method (e.g. Mack 1993, ASOP No. 25), and add at least one unit test. PRs run against Python 3.10, 3.11, and 3.12 on Ubuntu.

What exists today

File structure

This is what the repository actually looks like now. Directories marked with a ← Phase N arrow contain only stubs or __init__.py files; they are scaffolded but not yet populated.

actuaflow/
│
├── glm/
│   ├── __init__.py
│   ├── models.py            # BaseGLM, FrequencyGLM, SeverityGLM
│   ├── models.pyi           # type stubs
│   └── zero_inflated.py     # ZeroInflatedPoisson, ZINB       ✅ P1-5
│
├── credibility/
│   ├── __init__.py
│   ├── buhlmann.py          # BuhlmannCredibility              ✅ P1-4
│   └── limited_fluctuation.py                                  ✅ P1-4
│
├── assumptions/
│   ├── __init__.py
│   ├── projection.py        # AssumptionProjector, InflationAdjuster  ✅ P1-2
│   └── assumption_helper.py # AssumptionHelper                 ✅ P1-3
│
├── explain/
│   ├── __init__.py
│   └── change_explainer.py  # ChangeExplainer                  ✅ P1-1
│
├── registry/
│   ├── __init__.py
│   └── model_store.py       # ModelStore (save/load/version/compare)  ✅ P0-3
│
├── config/
│   └── settings.py          # ActuaFlowSettings (pydantic)     ✅ P0-2
│
├── freqsev/
│   ├── __init__.py
│   ├── frequency.py         # existing base
│   ├── severity.py          # existing base
│   └── aggregate.py         # existing base
│
├── exposure/
│   ├── rating.py
│   └── trending.py
│
├── diagnostics/
│   └── diagnostics.py       # existing: lift curve, Gini, Cook's, leverage
│
├── portfolio/
│   └── impact.py
│
├── utils/
│   └── data_loading.py
│
├── __init__.py
│
│   ── Not yet implemented (scaffolded only) ──
│
├── tariff/          ← Phase 2
├── optimization/    ← Phase 2
├── data/            ← Phase 2
├── compliance/      ← Phase 2
├── ml/              ← Phase 3
├── deployment/      ← Phase 3 / 5
├── monitoring/      ← Phase 4
├── audit/           ← Phase 4
├── governance/      ← Phase 4
├── security/        ← Phase 4
├── geo/             ← Phase 5
├── reserving/       ← Phase 5
├── market/          ← Phase 5
├── aggregation/     ← Phase 5
├── simulation/      ← Phase 5
└── actuaflow_core/  ← Phase 5 (Rust extension — not yet built)

.github/workflows/
├── ci.yml           # test matrix Python 3.10/3.11/3.12       ✅ P0-1
├── release.yml      # PyPI publish via maturin                 ✅ P0-1
└── security.yml     # bandit, safety, semgrep (weekly)         ✅ P0-1

.pre-commit-config.yaml  # black, ruff, mypy                    ✅ P0-1
pyproject.toml
docs/
tests/

Working examples

Fitting a frequency model

from actuaflow.glm.models import FrequencyGLM

freq = FrequencyGLM(family="poisson", link="log")
freq.fit(data, "n_claims ~ age_group + vehicle_age + ncd_years",
         offset="exposure")

freq.predict(new_data)
freq.diagnostics()   # returns aic, bic, deviance, converged, dispersion
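
To make the role of offset="exposure" concrete: under a log link the exposure enters the linear predictor as log(exposure) with a fixed coefficient of 1, so the expected claim count is the exposure multiplied by the exponentiated linear predictor. The sketch below is plain Python, not ActuaFlow code. The coefficient values, the level name vehicle_age[10+], and the helper expected_claims are invented for illustration.

```python
import math

# Hypothetical fitted coefficients on the log scale (illustrative values only).
coefs = {
    "intercept": math.log(0.08),   # base frequency: 0.08 claims per exposure-year
    "age_group[18-25]": 0.418,
    "vehicle_age[10+]": -0.105,
}

def expected_claims(levels, exposure, coefs):
    """Expected claim count for one policy under a Poisson GLM with log link."""
    linear_predictor = coefs["intercept"] + sum(coefs[level] for level in levels)
    # The offset log(exposure) is added with coefficient 1, which is the same
    # as multiplying the predicted annual frequency by the exposure.
    return math.exp(linear_predictor + math.log(exposure))

full_year = expected_claims(["age_group[18-25]"], exposure=1.0, coefs=coefs)
half_year = expected_claims(["age_group[18-25]"], exposure=0.5, coefs=coefs)
print(round(full_year, 4), round(half_year, 4))
```

Halving the exposure halves the expected count while leaving the fitted relativities untouched, which is why exposure is passed as an offset rather than as a rating factor.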

Versioning and comparing models

from actuaflow.registry.model_store import ModelStore

freq.save("motor_freq", tag="2026_q2", reason="Q2 refresh")

store = ModelStore()
store.list_versions("motor_freq")
# tag        timestamp            aic
# 2026_q2    2026-03-15 14:22     47802
# 2026_q1    2025-12-01 09:14     48231

store.compare("motor_freq", "2026_q1", "2026_q2")
# param               coef_a    coef_b    relativity_a    relativity_b
# age_group[18-25]    0.418     0.476     1.52            1.61
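
The relativity columns in the comparison above are consistent with the usual log-link convention, relativity = exp(coefficient). The sketch below reproduces that arithmetic in plain Python; it assumes this convention and does not call ModelStore.

```python
import math

def relativity(coef):
    """Multiplicative relativity implied by a log-link coefficient."""
    return math.exp(coef)

coef_a, coef_b = 0.418, 0.476   # age_group[18-25] in the two versions shown above
rel_a, rel_b = relativity(coef_a), relativity(coef_b)

# Percentage change in the relativity between the two versions.
pct_change = (rel_b / rel_a - 1.0) * 100.0
print(f"{rel_a:.2f} -> {rel_b:.2f} ({pct_change:+.1f}%)")
```

Computed from the unrounded relativities the change is about +6.0%, while the rounded figures 1.52 and 1.61 give +5.9%, so it matters whether a report rounds before or after taking the ratio.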

Explaining what changed between versions

from actuaflow.explain.change_explainer import ChangeExplainer

old = FrequencyGLM.load("motor_freq", tag="2026_q1")
new = FrequencyGLM.load("motor_freq", tag="2026_q2")

explainer = ChangeExplainer(old, new)
print(explainer.plain_english_summary())
# AIC improved 48,231 -> 47,802 (-429). Top relativity changes:
#   age_group[18-25]: 1.52 -> 1.61 (+5.9%)
#   ncd_years[0]: 1.34 -> 1.29 (-3.7%)
# Average premium impact across portfolio: +2.1%

explainer.coefficient_changes()       # full DataFrame
explainer.premium_impact(data)        # mean/median change, % up/down/flat
explainer.plot_coefficient_changes()  # horizontal bar chart

Credibility weighting

from actuaflow.credibility.buhlmann import BuhlmannCredibility

buhlmann = BuhlmannCredibility()
buhlmann.fit(data, group_col="territory",
             exposure_col="exposure", loss_col="n_claims")
buhlmann.blend(experience_rate=0.082, exposure=450, manual_rate=0.071)
buhlmann.summary()   # K, EPV, VHM, grand mean, Z per group
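
For readers who want the arithmetic behind the blend, here is a minimal sketch of the textbook Bühlmann formulas: K = EPV / VHM, Z = n / (n + K), and blended rate = Z × experience + (1 − Z) × manual. The EPV and VHM values are made up for illustration, and the function names are not part of the ActuaFlow API.

```python
def buhlmann_z(exposure, epv, vhm):
    """Credibility factor Z = n / (n + K), with K = EPV / VHM."""
    k = epv / vhm
    return exposure / (exposure + k)

def blend(experience_rate, manual_rate, z):
    """Credibility-weighted rate: Z on own experience, 1 - Z on the manual rate."""
    return z * experience_rate + (1.0 - z) * manual_rate

# Illustrative variance components (not fitted from data).
epv = 0.075     # expected process variance
vhm = 0.00025   # variance of hypothetical means, so K = 300

z = buhlmann_z(exposure=450, epv=epv, vhm=vhm)
rate = blend(experience_rate=0.082, manual_rate=0.071, z=z)
print(round(z, 3), round(rate, 4))
```

With K = 300, a group with 450 units of exposure gets Z = 0.6, so the blended rate sits 60% of the way from the manual rate to the group's own experience.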

Zero-inflated models

from actuaflow.glm.zero_inflated import ZeroInflatedPoisson

model = ZeroInflatedPoisson()
model.fit(data,
          count_formula="n_claims ~ age_group + vehicle_age",
          inflation_formula="ncd_years",
          offset="exposure")
model.vuong_test(data)     # V-stat and p-value vs plain Poisson
model.diagnostics()        # includes pct_structural_zeros
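
As background, a zero-inflated Poisson mixes a point mass at zero (probability pi, the structural zeros) with an ordinary Poisson count. The sketch below writes out that probability mass function in plain Python. The parameter values are illustrative, and the function is not ActuaFlow's implementation.

```python
import math

def zip_pmf(k, lam, pi):
    """P(N = k) for a zero-inflated Poisson with rate lam and zero-inflation pi."""
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # Zeros come from two sources: structural zeros and Poisson zeros.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson

lam, pi = 0.3, 0.2   # illustrative parameters
p_zero_zip = zip_pmf(0, lam, pi)
p_zero_poisson = math.exp(-lam)
mean_zip = sum(k * zip_pmf(k, lam, pi) for k in range(30))
print(round(p_zero_zip, 4), round(p_zero_poisson, 4), round(mean_zip, 4))
```

The zero-inflated model puts more mass on zero than a plain Poisson with the same rate, and its mean shrinks to (1 − pi) × lam.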

Assumption projection

from actuaflow.assumptions.projection import AssumptionProjector

proj = AssumptionProjector(base_year=2023, target_year=2026)
proj.add_trend("claim_severity", annual_rate=0.05,
               method="exponential", comment="CPI trend")
projected = proj.apply_all(data, date_col="accident_date")
proj.assumption_table()    # all trends in one DataFrame
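
The arithmetic behind an exponential trend can be sketched as follows, assuming the conventional compound form factor = (1 + rate) ** years. Whether AssumptionProjector trends each record from its own accident date or from base_year is not stated here, so this sketch uses whole years only, and its function name is hypothetical.

```python
def exponential_trend_factor(annual_rate, from_year, to_year):
    """Compound trend factor between two years: (1 + rate) ** (number of years)."""
    return (1.0 + annual_rate) ** (to_year - from_year)

factor = exponential_trend_factor(annual_rate=0.05, from_year=2023, to_year=2026)
trended_severity = 10_000.0 * factor   # illustrative claim amount at 2023 cost levels
print(round(factor, 6), round(trended_severity, 2))
```

Three years at 5% compounds to a factor of 1.157625, noticeably more than the 1.15 a simple linear trend would give.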

Where it is going

Everything below is planned — not yet built.

Completing Phase 1 (items 6–13, in progress)

P1-6 Hurdle models (HurdleGamma and HurdleLognormal). Two-part models: a binomial GLM predicts whether a claim occurs, a second GLM predicts severity given a claim. Useful when zero and non-zero observations have structurally different drivers.
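
A minimal sketch of how the two parts of a hurdle model combine, assuming a lognormal severity: expected cost = P(claim) × E[severity | claim], where the lognormal mean is exp(mu + sigma² / 2). The parameter values and function names are illustrative only, since HurdleLognormal is not yet built.

```python
import math

def lognormal_mean(mu, sigma):
    """Mean of a lognormal distribution with log-scale parameters mu and sigma."""
    return math.exp(mu + 0.5 * sigma ** 2)

def hurdle_expected_cost(p_claim, mu, sigma):
    """Expected cost per policy: claim probability times conditional mean severity."""
    return p_claim * lognormal_mean(mu, sigma)

# Illustrative outputs of the two sub-models for a single policy.
cost = hurdle_expected_cost(p_claim=0.06, mu=7.5, sigma=1.0)
print(round(cost, 2))
```

Because each part has its own linear predictor, a rating factor can raise the claim probability while lowering the conditional severity, which a single-model approach cannot express.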

P1-7 Large loss treatment (LargeLossHandler). Caps extreme claims using a Generalised Pareto Distribution fitted to excesses above a threshold. Produces a large_loss_load for use in aggregate pricing.

P1-8 Exposure curves (ExposureCurve). Adjusts for partial-year exposure using empirical polynomial regression or the actuarial parallelogram method.

P1-9 Actuarial diagnostics (ActuarialDiagnostics). The standard battery of checks that go into every model review: actual-vs-expected by segment, double lift chart, minimum bias, claim count distribution comparison, loss ratio validation, coefficient stability over time.
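
As an illustration of the first check in that list, actual-vs-expected by segment is the ratio of summed actual claims to summed model predictions within each level of a rating factor. The sketch below uses plain Python and made-up records; the function name and record layout are assumptions, not the planned ActuarialDiagnostics interface.

```python
from collections import defaultdict

# Made-up policy-level records: observed claim counts and model expectations.
records = [
    {"age_group": "18-25", "actual": 12, "expected": 10.0},
    {"age_group": "18-25", "actual": 11, "expected": 10.0},
    {"age_group": "26-40", "actual": 8, "expected": 10.0},
    {"age_group": "26-40", "actual": 10, "expected": 10.0},
]

def actual_vs_expected(records, key):
    """A/E ratio per segment: sum of actuals divided by sum of expecteds."""
    totals = defaultdict(lambda: [0.0, 0.0])
    for row in records:
        totals[row[key]][0] += row["actual"]
        totals[row[key]][1] += row["expected"]
    return {segment: a / e for segment, (a, e) in totals.items()}

ae = actual_vs_expected(records, key="age_group")
print(ae)
```

A ratio above 1 means the model under-predicts for that segment, and a ratio below 1 means it over-predicts.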

P1-10 to P1-13 — customer lifetime value (CustomerLifetimeValue), stratified and accident year CV splits, a complete type system with .pyi stubs and retry decorators, and the strategy pattern with an event bus that later phases depend on.

Phase 2 — Rating engine

The centrepiece is the tariff builder. build_tariff_from_glm() takes a fitted model and produces a fully versioned, auditable TariffPlan in one call — no manual relativity extraction. Alongside it: a constrained rate optimiser using linear programming with credibility-weighted blending, GAMs with B-spline and P-spline smoothing, monotonicity-constrained GLMs, quantile regression for severity, an automated data quality and preprocessing pipeline, and a complete regulatory compliance package with SERFF XML export.
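
To show the manual relativity extraction that the tariff builder is meant to replace, the sketch below turns log-scale coefficients into relativity tables and prices a risk multiplicatively. The coefficient values, base rate, and dictionary layout are hypothetical. This is not the TariffPlan structure, which does not exist yet.

```python
import math

# Hypothetical log-scale coefficients; base levels carry a coefficient of 0.
coefficients = {
    "age_group": {"26-40": 0.0, "18-25": 0.418},
    "ncd_years": {"5+": 0.0, "0": 0.293},
}
base_rate = 320.0   # illustrative base premium in currency units

# One relativity table per rating factor: exp(coefficient) for each level.
relativities = {
    factor: {level: math.exp(coef) for level, coef in levels.items()}
    for factor, levels in coefficients.items()
}

def premium(risk, base_rate, relativities):
    """Multiplicative tariff: base rate times one relativity per rating factor."""
    result = base_rate
    for factor, level in risk.items():
        result *= relativities[factor][level]
    return result

base_risk = premium({"age_group": "26-40", "ncd_years": "5+"}, base_rate, relativities)
young_no_ncd = premium({"age_group": "18-25", "ncd_years": "0"}, base_rate, relativities)
print(round(base_risk, 2), round(young_no_ncd, 2))
```

A risk at the base level of every factor pays exactly the base rate, which is a quick sanity check for any hand-built tariff.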

Phase 3 — Making it deployable

A FastAPI REST server so any system can call the model over HTTP. A plugin architecture for custom model types. Async fitting to run multiple models concurrently. XGBoost integration with multiplicative SHAP (relativities in actuarial space, not log space). ONNX export for vendor-neutral deployment. A/B testing framework. AutoML for automatic family and link function selection. Docker and Conda packaging. Synthetic motor and property datasets for reproducible testing.

Phase 4 — Monitoring and governance

Once a model is in production it needs to be watched. This phase adds:

  • Four-layer monitoring stack — input drift (PSI, Wasserstein), adverse selection signals (bind rate, lapse rate), virtual performance (shadow models, out-of-distribution flagging), and rolling A/E with IBNR development
  • Immutable audit log — every model fit, rate change, and deployment is hash-chained. The chain verifies integrity and is embedded in the .aflow bundle
  • Model governance — development → peer review → approved → production → retired lifecycle with a structured peer review checklist
  • Fairness circuit breaker — four-fifths test, premium disparity test, proxy discrimination check. Blocks promotion to production if tests fail
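
The hash-chaining idea behind the immutable audit log can be sketched with the standard library: each entry stores the hash of the previous entry, so altering any earlier record invalidates every hash after it. The entry layout and function names below are assumptions for illustration, not the planned audit log format.

```python
import hashlib
import json

GENESIS = "0" * 64   # placeholder "previous hash" for the first entry

def entry_hash(prev_hash, payload):
    """SHA-256 over the previous hash plus a canonical JSON encoding of the payload."""
    body = json.dumps(payload, sort_keys=True)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()

def append_entry(chain, payload):
    """Append a new entry that is linked to the current tail of the chain."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"payload": payload, "prev_hash": prev_hash,
                  "hash": entry_hash(prev_hash, payload)})

def verify_chain(chain):
    """Recompute every hash in order; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True

log = []
append_entry(log, {"event": "fit", "model": "motor_freq", "tag": "2026_q2"})
append_entry(log, {"event": "deploy", "model": "motor_freq", "tag": "2026_q2"})
print(verify_chain(log))
```

Verification only proves internal consistency. Detecting a wholesale replacement of the log would additionally require storing the latest hash somewhere the writer cannot modify.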

Also in this phase: MLflow registry integration, PII detection and masking, OpenTelemetry and Prometheus observability, and a fluent interface and builder pattern for ergonomic model construction.

Phase 5 — Advanced architecture

Two large items anchor this phase:

Full Rust engine. Every computational module rewritten in Rust — all GLM families, ZI EM algorithm, hurdle models, splines, credibility, chain ladder, GPD fitting, rate optimisation, copula simulation, diagnostics, monitoring. The Rust engine runs alongside the Python engine and activates automatically above a configurable row threshold (default 50,000 rows). Same API and identical outputs either way.

The .aflow production cycle. A binary bundle format that packages a model with its preprocessing pipeline, monitoring thresholds, and full audit chain into a single tamper-evident file. A standalone Rust sidecar (~2 MB) that IT deploys once. To deploy a new model version, the actuary drops a .aflow file into a watched directory — the sidecar verifies the hash, hot-swaps with zero downtime, and auto-rolls back if the rolling A/E ratio exceeds the threshold baked into the bundle.

Development     →    Export bundle    →    Drop file into    →    Rust sidecar scores
PricingSession       builder.validate()    /var/actuaflow/        via HTTP, emits
.fit()               parity check:         bundles/               telemetry, auto-
.optimise()          Python == Rust                               rolls back if
                                                                  A/E drifts
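
The rollback trigger at the end of that cycle can be sketched as a rolling actual-vs-expected monitor. This is plain Python under stated assumptions: the window length, the threshold value, and the class name RollingAE are invented here, and the real sidecar is planned as a Rust binary.

```python
from collections import deque

class RollingAE:
    """Tracks actual and expected totals over the most recent `window` periods."""

    def __init__(self, window, max_ae):
        self.periods = deque(maxlen=window)   # oldest period drops out automatically
        self.max_ae = max_ae                  # threshold that would ship in the bundle

    def update(self, actual, expected):
        self.periods.append((actual, expected))

    def ratio(self):
        total_expected = sum(expected for _, expected in self.periods)
        if total_expected == 0:
            return None   # no expected claims yet, so no ratio to judge
        return sum(actual for actual, _ in self.periods) / total_expected

    def should_roll_back(self):
        current = self.ratio()
        return current is not None and current > self.max_ae

monitor = RollingAE(window=3, max_ae=1.10)
for actual, expected in [(100, 100.0), (105, 100.0), (130, 100.0)]:
    monitor.update(actual, expected)
print(monitor.ratio(), monitor.should_roll_back())
```

Using a rolling window rather than a single period keeps one noisy month from triggering a rollback while still reacting to sustained under-prediction.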

Also in Phase 5: Polars data layer, IBNR engine (chain ladder, ODP bootstrap, BF, Cape Cod), Gaussian Process spatial modelling for territory analysis with posterior uncertainty per location, competitive intelligence and demand modelling, sub-peril aggregation with cost of capital, portfolio flight simulator, climate risk enrichment, adversarial stress testing.

Phase 6 — Visualisation

Seven actuarial-specific charts absent from general data science packages:

Chart                         What it shows
Double lift                   Two models vs actual by decile, sorted by their ratio
Coefficient relativity        exp(coef) with confidence intervals, one panel per variable
A/E by segment                Model bias per rating factor with RAG colour coding
Claim count distribution      Actual vs predicted count shape (0, 1, 2, 3, 4+)
Premium waterfall             Build-up from pure premium through loadings to gross
Calibration curve             Absolute calibration check, not just rank-ordering
4-panel monitoring dashboard  Monthly governance report: A/E, drift, behavioural, stability
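
The data behind a double lift chart can be sketched as follows: sort policies by the ratio of the two models' predictions, split them into equal-count bins, and compare average actuals with each model's average prediction per bin. The sketch uses unweighted equal-count bins and a hypothetical function name. A production version would normally weight by exposure.

```python
def double_lift(actual, pred_a, pred_b, n_bins=10):
    """Per-bin averages of actual, model A and model B, sorted by pred_a / pred_b."""
    order = sorted(range(len(actual)), key=lambda i: pred_a[i] / pred_b[i])
    n = len(order)
    rows = []
    for b in range(n_bins):
        idx = order[b * n // n_bins:(b + 1) * n // n_bins]
        if not idx:
            continue   # fewer policies than bins
        rows.append({
            "bin": b + 1,
            "actual": sum(actual[i] for i in idx) / len(idx),
            "model_a": sum(pred_a[i] for i in idx) / len(idx),
            "model_b": sum(pred_b[i] for i in idx) / len(idx),
        })
    return rows

# Tiny made-up example: model A tracks the actuals, model B predicts a flat 2.0.
actual = [1.2, 1.8, 3.1, 3.9]
pred_a = [1.0, 2.0, 3.0, 4.0]
pred_b = [2.0, 2.0, 2.0, 2.0]
rows = double_lift(actual, pred_a, pred_b, n_bins=2)
for row in rows:
    print(row)
```

The model whose line stays closer to the actuals across the bins, especially in the first and last bins where the two models disagree most, is the better discriminator.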

ActuaFlow is maintained by its contributors. It is not affiliated with any actuarial standards body.