ShrishDhuria/sfdr2-classification-engine

SFDR 2.0 Portfolio Classification Engine

A transparent implementation of the EU SFDR 2.0 product-classification logic (Sustainable / Transition / ESG Basics) proposed by the European Commission on 20 November 2025, including the Paris-Aligned Benchmark (PAB) exclusion screens, the 70% category floor, and the 15%-Taxonomy and PAB-replication deeming provisions.

What it demonstrates

Every classification is gated, then thresholded, and fully audited: PAB exclusion screens run first, then the 70% category floor (with the 15%-Taxonomy and PAB-replication deeming provisions), and the decarbonisation pathway is reported as a diagnostic, not a gate. On the bundled sample portfolio, deliberately constructed so that every screen fires at least once, no category clears the 70% floor, so the book is Non-classifiable. Post-exclusion carbon intensity sits just above the 1.5°C pathway target: a compact demonstration that exclusions alone do not put a portfolio on a pathway.

Figure: SFDR 2.0 category test and pathway diagnostic

Left: qualifying weight per category against the 70% floor, and the resulting label. Right: the pathway diagnostic — exclusions cut weighted intensity from 149 to 56 tCO2e/EURm, still above the 52 target, so the book is flagged "reweighting required" rather than disqualified. Regenerate with python make_figures.py.

Why this project

SFDR 2.0 replaces the Article 8/9 disclosure regime with three labelled product categories and provides no grandfathering — every existing fund must requalify. Classification is a gate-then-threshold structure, not an ESG score: a portfolio either clears a sequence of binary exclusion gates and a quantitative floor, or it cannot use the label at all. This engine implements that logic with a full per-decision audit trail.
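The gate-then-threshold structure can be sketched in a few lines. This is a minimal illustration, not the engine's code: the `Holding` fields, the `classify` signature, the per-issuer `qualifies` set, and the choice to map both deeming provisions to the Sustainable label are all assumptions made for the example. Weights are shares of the whole book, so excluded and data-gap names stay in the denominator.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple

FLOOR = 0.70           # category floor quoted in this README
TAXONOMY_DEEM = 0.15   # 15%-Taxonomy deeming threshold quoted in this README
CATEGORIES = ("Sustainable", "Transition", "ESG Basics")  # strictest first


@dataclass(frozen=True)
class Holding:  # hypothetical data contract, for illustration only
    name: str
    weight: float                  # share of the whole book (sums to 1)
    qualifies: FrozenSet[str]      # categories this issuer counts towards
    taxonomy_aligned: float = 0.0  # Taxonomy-aligned revenue share, 0..1
    excluded: bool = False         # True if any PAB exclusion screen fired
    data_gap: bool = False         # True if a required input is missing


def classify(holdings: List[Holding], replicates_pab: bool = False) -> Tuple[str, List[str]]:
    audit: List[str] = []
    # Gate: excluded or data-gap names never count as qualifying,
    # but their weight remains in the denominator.
    counted = []
    for h in holdings:
        if h.excluded:
            audit.append(f"gate: {h.name} excluded ({h.weight:.1%} of book)")
        elif h.data_gap:
            audit.append(f"gate: {h.name} flagged for review ({h.weight:.1%} of book)")
        else:
            counted.append(h)

    # Deeming provisions (assumed here to yield the Sustainable label).
    if replicates_pab:
        audit.append("deem: product replicates a PAB")
        return "Sustainable", audit
    aligned = sum(h.weight * h.taxonomy_aligned for h in counted)
    audit.append(f"deem: Taxonomy-aligned share {aligned:.1%} vs {TAXONOMY_DEEM:.0%}")
    if aligned >= TAXONOMY_DEEM:
        return "Sustainable", audit

    # Threshold: the strictest category whose qualifying weight clears the floor wins.
    for category in CATEGORIES:
        qualifying = sum(h.weight for h in counted if category in h.qualifies)
        audit.append(f"floor: {category} qualifying weight {qualifying:.1%} vs {FLOOR:.0%}")
        if qualifying >= FLOOR:
            return category, audit
    return "Non-classifiable", audit


# Made-up three-name book: one exclusion removes 35% of qualifying weight.
book = [
    Holding("A", 0.40, frozenset(CATEGORIES), taxonomy_aligned=0.20),
    Holding("B", 0.35, frozenset(), excluded=True),
    Holding("C", 0.25, frozenset({"Transition", "ESG Basics"}), taxonomy_aligned=0.10),
]
label, trail = classify(book)
print(label)
for line in trail:
    print("  ", line)
```

Every branch appends to the audit list before returning, which is one simple way to guarantee that no label is issued without a trail.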

Project structure

sfdr2-classification-engine/
├── sfdr2/
│   ├── config.py       # All regulatory parameters (thresholds, pathway, proxies)
│   ├── data.py         # Phase 1 data contract + synthetic sample portfolio
│   ├── exclusions.py   # Phase 2: PAB exclusion screening (norms + activity)
│   ├── alignment.py    # Phase 3: Taxonomy alignment, 70% test, decarb pathway
│   ├── classify.py     # Phase 4: classification decision-tree orchestrator
│   └── sensitivity.py  # Phase 5: cost-of-compliance & robustness analysis
├── phase7/
│   ├── bl_market_data.py  # Bridge: BL data layer (real SX5E returns + LW cov)
│   ├── esg_data.py        # Phase 7a: real universe + sector-archetype ESG proxies
│   └── esg_optimiser.py   # Phase 7b/c/d: min-TE optimiser, scenarios, frontier
├── run_phase2.py       # Exclusion screening report
├── run_phase3.py       # Alignment & threshold report
├── run_phase4.py       # Full classification + PAB-replication counterfactual
├── run_phase5.py       # Cost-of-compliance & robustness analysis
├── run_phase7.py       # ESG-tilted optimiser report + frontier chart
├── dashboard.py        # Phase 6: interactive Streamlit dashboard (classification)
├── dashboard_phase7.py # Phase 7: interactive ESG-tilted optimiser dashboard
├── build_workbook.js   # Phase 6: generates sfdr2_screening.xlsx (ExcelJS, JS)
├── build_deck.js       # Phase 6: generates sfdr2_methodology.pptx (pptxgenjs, JS)
├── package.json        # JS deps + npm scripts for the Office generators
└── requirements.txt

Delivery layer (Phase 6)

Three artefacts. The two Office files are built natively in JavaScript (no Python) and are standalone — the verified figures are embedded, so they open and present without running the engine:

  • dashboard.py — live Streamlit app with interactive PAB-replication toggle, year slider, and Taxonomy multiplier. Run: streamlit run dashboard.py
  • sfdr2_screening.xlsx — formula-driven workbook built with ExcelJS. Cover, Parameters, Issuers (formula breach columns with conditional formatting), Summary (cross-sheet formulas), and Findings tabs. Change a parameter on the Parameters tab and every screen recomputes. Regenerate with npm run workbook (or node build_workbook.js).
  • sfdr2_methodology.pptx — 9-slide deck built with pptxgenjs, with native (editable) PowerPoint charts, the decision tree, the four findings, and the synthesis. Regenerate with npm run deck (or node build_deck.js).

To rebuild both Office files: npm install && npm run build.

Phase 7 — ESG-tilted optimiser (bridge to Black-Litterman)

Phase 7 closes the decarbonisation residual that the SFDR engine diagnoses. It finds the minimum-tracking-error tilt of the SX5E benchmark that (a) drops PAB-excluded names to zero weight and (b) drives weighted-average carbon intensity to the 1.5°C pathway target, subject to (c) long-only, fully invested weights with a UCITS-style per-name cap.

The bridge. phase7/esg_data.py imports the real SX5E universe and the real return/covariance bundle from the Black-Litterman project's market_data.py (vendored here as phase7/bl_market_data.py). The covariance is the same yfinance + Ledoit-Wolf estimate the BL engine uses, so Σ is methodologically identical across the two projects. To point at your live BL project instead, replace phase7/bl_market_data.py or edit the import in esg_data.py — they are the same module.

Data honesty. Returns and market-cap weights are real public market data. ESG attributes are transparent sector-archetype proxies (an energy-major archetype, a clean-utility archetype, etc.), documented placeholders for public-source data (Urgewald GCEL/GOGEL, Norges Bank/GPFG exclusion list, CSRD / Taxonomy Article 8 disclosures). They are not audited ESG figures for the named issuers.

Why minimum tracking error, not max Sharpe. In a real ESG mandate the PM has already committed to the label; the task is to deliver compliance at the lowest cost to the benchmark-relative profile. Exclusions are encoded as hard zeros (a regulatory bright line is binary, not a penalty); the carbon constraint is linear in weights, keeping the program a convex QP solved with SLSQP — the same solver and conventions as the BL optimiser.
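Read literally, the description above corresponds to the following program. The notation is introduced here for illustration and is not taken from the code: $w$ are portfolio weights, $b$ benchmark weights, $\Sigma$ the return covariance, $c$ the vector of issuer carbon intensities, $c^{\star}$ the pathway target, $\mathcal{E}$ the set of PAB-excluded names, and $u$ the per-name cap.

```latex
\begin{aligned}
\min_{w}\quad & (w - b)^{\top} \Sigma \,(w - b) \\
\text{s.t.}\quad & \mathbf{1}^{\top} w = 1, \\
& 0 \le w_i \le u \quad \forall i, \\
& w_i = 0 \quad \forall i \in \mathcal{E}, \\
& c^{\top} w \le c^{\star}.
\end{aligned}
```

The objective is the squared tracking error, which has the same minimiser as tracking error itself. With $\Sigma$ positive semi-definite and every constraint linear in $w$, the problem is a convex QP. The carbon constraint is written as an inequality on the assumption that it binds whenever the benchmark's own intensity exceeds the target.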

Key finding — the cost of green. On the SX5E universe, reaching the 1.5°C pathway (a ~68% intensity cut vs the parent benchmark) costs roughly 335 bps of annual tracking error under a 12% name cap. The cost-of-green frontier (cost_of_green_frontier.png) is convex: each additional tonne of intensity removed costs more active risk than the last. Because only ~11% of the benchmark is PAB-excluded and qualifying weight stays above 70%, the three compliance paths (pure 70%, 15%-Taxonomy deem, PAB replication) coincide on this universe — itself a finding: for a large-cap European benchmark, the binding constraint is carbon intensity, not the qualifying-weight floor.

Run: python3 run_phase7.py (console report + frontier chart) or streamlit run dashboard_phase7.py (interactive).

Key finding (Phase 5)

On the sample portfolio, SFDR 2.0 classification is driven more by product structure and a single deeming threshold than by portfolio composition:

  • The PAB exclusion gate destroys 45% of qualifying weight, attributed across five distinct screens (no single screen dominates) — so a portfolio of individually-clean names is still non-classifiable.
  • Classification flips from Non-classifiable to Sustainable at just 1.40× the current Taxonomy-aligned revenue via the 15% deem — a tipping point well inside the noise band of Taxonomy estimates.
  • Once the product is structured to replicate a PAB, the label is completely insensitive to the internal qualification proxies (Sustainable/Transition thresholds, even the 70% floor): the deeming provision overrides them all.
  • Exclusion screening alone leaves the portfolio 14.4% above the 1.5°C pathway — screening is not decarbonisation.

The practitioner implication: the SFDR 2.0 label is a weak signal of a portfolio's actual sustainability; it is largely determined by product-design choices and one estimated threshold. This is the project's research output, not a defect of the engine.
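The tipping-point idea behind the second finding can be illustrated with a small search: scale every issuer's Taxonomy-aligned revenue share by a common multiplier and find the smallest value at which the weighted share reaches the 15% deem. This is a sketch under assumptions, not the engine's sensitivity code. The function names, the per-issuer cap at 100%, and the two-name portfolio are hypothetical, and the example does not reproduce the 1.40× figure reported for the sample portfolio.

```python
from typing import List, Optional, Tuple

TAXONOMY_DEEM = 0.15  # 15%-Taxonomy deeming threshold quoted in this README


def aligned_share(positions: List[Tuple[float, float]], multiplier: float) -> float:
    """Weighted Taxonomy-aligned share after scaling each issuer's alignment.

    positions holds (weight, aligned_revenue_share) pairs; the scaled
    alignment is capped at 100% per issuer (an assumption of this sketch).
    """
    return sum(w * min(1.0, a * multiplier) for w, a in positions)


def tipping_multiplier(
    positions: List[Tuple[float, float]],
    threshold: float = TAXONOMY_DEEM,
    max_multiplier: float = 10.0,
    tol: float = 1e-6,
) -> Optional[float]:
    """Smallest multiplier >= 1 at which the deem threshold is reached, or None."""
    if aligned_share(positions, 1.0) >= threshold:
        return 1.0  # already deemed, no uplift needed
    if aligned_share(positions, max_multiplier) < threshold:
        return None  # unreachable within the search range
    lo, hi = 1.0, max_multiplier
    # Bisection works because aligned_share is non-decreasing in the multiplier.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if aligned_share(positions, mid) >= threshold:
            hi = mid
        else:
            lo = mid
    return hi


# Made-up two-name book with an 11% weighted aligned share.
sample = [(0.5, 0.10), (0.5, 0.12)]
print(tipping_multiplier(sample))
```

A multiplier close to 1 means the label would flip under a small revision to the Taxonomy estimates, which is the fragility the finding describes.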

How to run

Phases 1–5 require only Python 3.8+, with no third-party dependencies.

python3 run_phase2.py    # PAB exclusion screening
python3 run_phase3.py    # Taxonomy alignment & the 70% test
python3 run_phase4.py    # Full classification engine (2 scenarios)
python3 run_phase5.py    # Cost-of-compliance & robustness analysis
python3 run_phase7.py    # ESG-tilted optimiser + cost-of-green frontier (needs network)

Note: Phases 2–5 use the standard library only. Phase 7 requires the full requirements.txt (yfinance, scikit-learn, scipy, matplotlib) and live network access for the SX5E data pull.

Methodology notes

  • Thresholds are externalised in config.py so the regulatory parameters are auditable in one place and sensitivity analysis is a config change, not a logic change. The PAB minimum standards are taken from Commission Delegated Regulation (EU) 2020/1818.
  • Data gaps are conservative. A missing input is never treated as a zero or as alignment; the issuer is flagged for review and never counts as qualifying, but remains an "investment" in the denominator. This reflects SFDR 2.0's estimation-discipline and documentation-on-request obligations.
  • Synthetic sample data. data.py uses synthetic sector archetypes, not real index constituents with invented ESG figures. A later phase wires public proxy sources (Urgewald Global Coal Exit List / Global Oil & Gas Exit List, Norges Bank / GPFG exclusion list, company CSRD / Taxonomy Article 8 disclosures).
  • Pathway alignment is a diagnostic, not a gate. The category test is the exclusions + threshold structure; the PAB 1.5°C decarbonisation pathway is a benchmark-construction standard reported as a warning, not a classification gate. Conflating the two is a common error.
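The last point can be made concrete with a short sketch in which the pathway check only ever appends a warning and never changes the label. The function names and signatures are hypothetical. The 50% initial cut and 7% annual reduction are the PAB minimum standards in Delegated Regulation (EU) 2020/1818; the base year, and how config.py actually encodes the pathway, are assumptions here.

```python
from typing import List, Tuple


def weighted_intensity(positions: List[Tuple[float, float]]) -> float:
    """Weighted-average carbon intensity over (weight, intensity) pairs.

    Weights are renormalised, so the same function serves the full book
    and the post-exclusion book (renormalisation is an assumption).
    """
    total = sum(w for w, _ in positions)
    return sum(w * ci for w, ci in positions) / total


def pathway_target(
    parent_intensity: float,
    years_elapsed: int,
    initial_cut: float = 0.50,   # PAB: at least 50% below the parent universe
    annual_rate: float = 0.07,   # PAB: at least 7% year-on-year reduction
) -> float:
    """Pathway intensity target after a given number of years."""
    return parent_intensity * (1 - initial_cut) * (1 - annual_rate) ** years_elapsed


def apply_pathway_diagnostic(
    label: str, post_exclusion_intensity: float, target: float
) -> Tuple[str, List[str]]:
    """Return the label unchanged, plus a warning if the book is off-pathway."""
    warnings: List[str] = []
    if post_exclusion_intensity > target:
        gap = post_exclusion_intensity / target - 1
        warnings.append(f"reweighting required: intensity {gap:.1%} above pathway target")
    return label, warnings


target = pathway_target(parent_intensity=149.0, years_elapsed=5)
print(round(target, 1))
print(apply_pathway_diagnostic("Non-classifiable", 56.0, target))
```

With a parent intensity of 149 and five elapsed years the sketch gives roughly 51.8, close to the target of 52 quoted in the figure caption, although the engine's own base year is not stated in this README.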

Roadmap

  • Phase 5: Reclassification & cost-of-compliance sensitivity analysis
  • Phase 6: Streamlit dashboard, parallel Excel screening workbook, pptxgenjs methodology deck
  • Phase 7: ESG-tilted optimiser closing the decarbonisation residual (bridge to the Black-Litterman engine)

Limitations

Important caveats for a real ESG/regulatory context:

  • It implements a proposal. The 20 November 2025 EC text may change before it becomes law; every threshold is externalised in config.py precisely so a rule change is a config edit, not a logic change.
  • Category qualification uses documented proxies. Taxonomy-aligned revenue/capex and carbon-intensity proxies stand in for the full substantial-contribution / DNSH / minimum-safeguards test, which is not reconstructable from public data.
  • Parent-benchmark intensity is proxied by the pre-screen weighted average, not an official index level.
  • Data gaps bound the result. Missing inputs are treated conservatively (never a silent pass), which in practice parks names in lower tiers — so classification quality is capped by issuer-data coverage.
  • Screens depend on third-party flags (e.g. Urgewald GCEL/GOGEL, Norges Bank exclusions) whose coverage and update cadence vary.

Testing

A pytest suite under tests/ exercises the gate-then-threshold logic as boundary and decision tests rather than numeric-tolerance checks.

  • Exclusion gates — norms screens fire on a flag; activity thresholds are inclusive at the boundary (oil ≥ 10%, coal ≥ 1%); the power-generation carve-out requires both conditions; a missing input is a conservative data gap, never a silent pass.
  • 70% category floor — binds exactly at the boundary (isolated from the 15% deeming path) and drops the label to the next-strictest tier just below it.
  • Deeming — ≥ 15% Taxonomy-aligned revenue lifts a sub-floor portfolio to Sustainable via the deem path.
  • Decision tree — the strictest passing category wins; the decarbonisation pathway is a diagnostic, not a gate (a high-carbon but floor-clearing book stays classifiable, with a warning); every result carries a non-empty audit trail; exclusions that shrink the base below all floors yield "Non-classifiable".

To run the suite locally:

pip install -r requirements-dev.txt
pytest tests/ -q          # 10 tests

Tests run automatically on every push via GitHub Actions (.github/workflows/tests.yml).
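A boundary test in this style might look like the sketch below. It is illustrative only: `screen_activity`, the `THRESHOLDS` mapping, and the three outcome strings are hypothetical names rather than the engine's API, and only the two activity thresholds stated above (coal ≥ 1%, oil ≥ 10%) are modelled.

```python
from typing import Optional

# Revenue-share thresholds as stated in this README (inclusive at the boundary).
THRESHOLDS = {"coal": 0.01, "oil": 0.10}


def screen_activity(activity: str, revenue_share: Optional[float]) -> str:
    """Return 'breach', 'pass', or 'data_gap' for one activity screen."""
    if revenue_share is None:
        # A missing input is flagged for review, never treated as zero.
        return "data_gap"
    return "breach" if revenue_share >= THRESHOLDS[activity] else "pass"


def test_oil_threshold_is_inclusive():
    assert screen_activity("oil", 0.10) == "breach"
    assert screen_activity("oil", 0.0999) == "pass"


def test_coal_threshold_is_inclusive():
    assert screen_activity("coal", 0.01) == "breach"
    assert screen_activity("coal", 0.0099) == "pass"


def test_missing_input_is_a_data_gap_not_a_pass():
    assert screen_activity("coal", None) == "data_gap"
```

Each test asserts an exact decision on either side of a threshold, so no numeric tolerance is involved.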
