SFDR 2.0 Portfolio Classification Engine

A transparent implementation of the EU SFDR 2.0 product-classification logic (Sustainable / Transition / ESG Basics) proposed by the European Commission on 20 November 2025, including the Paris-Aligned Benchmark (PAB) exclusion screens, the 70% category floor, and the 15%-Taxonomy and PAB-replication deeming provisions.

What it demonstrates

Every classification is gated, thresholded, and fully audited: the PAB exclusion screens run first, then the 70% category floor is applied (with the 15%-Taxonomy and PAB-replication deeming provisions), and the decarbonisation pathway is reported as a diagnostic, not a gate. On the bundled sample portfolio, deliberately constructed so every screen fires at least once, no category clears the 70% floor, so the book is Non-classifiable, and post-exclusion carbon intensity sits just above the 1.5°C pathway target: a compact demonstration that exclusions alone do not put a portfolio on a pathway.

Left: qualifying weight per category against the 70% floor, and the resulting label. Right: the pathway diagnostic — exclusions cut weighted intensity from 149 to 56 tCO2e/EURm, still above the 52 target, so the book is flagged "reweighting required" rather than disqualified. Regenerate with python make_figures.py.
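The pathway diagnostic in the right-hand panel can be sketched as follows. This is a minimal illustration, not the engine's code: it assumes the target follows the PAB minimum-standard shape (a 50% cut against the parent benchmark, then 7% per year), and the function names, the `PathwayDiagnostic` type, and the choice of five elapsed years are all hypothetical. The inputs 149 and 56 are the rounded figures from the caption above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PathwayDiagnostic:
    target: float   # pathway target intensity, same units as the inputs
    gap_pct: float  # percent above (+) or below (-) the target
    status: str     # "on pathway" or "reweighting required"


def pathway_target(parent_intensity, years_elapsed,
                   initial_cut=0.50, annual_rate=0.07):
    """Target intensity under an assumed PAB-style trajectory."""
    return parent_intensity * (1.0 - initial_cut) * (1.0 - annual_rate) ** years_elapsed


def diagnose(portfolio_intensity, parent_intensity, years_elapsed):
    """Return a warning-level diagnostic; it never blocks classification."""
    target = pathway_target(parent_intensity, years_elapsed)
    gap_pct = (portfolio_intensity / target - 1.0) * 100.0
    status = "on pathway" if portfolio_intensity <= target else "reweighting required"
    return PathwayDiagnostic(round(target, 1), round(gap_pct, 1), status)


# Rounded caption figures; years_elapsed=5 is an assumption of this sketch.
result = diagnose(portfolio_intensity=56.0, parent_intensity=149.0, years_elapsed=5)
print(result.status)
```

The function returns a status string rather than raising or returning a boolean gate, which mirrors the design point that a book above the pathway is flagged, not disqualified.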

Why this project

SFDR 2.0 replaces the Article 8/9 disclosure regime with three labelled product categories and provides no grandfathering — every existing fund must requalify. Classification is a gate-then-threshold structure, not an ESG score: a portfolio either clears a sequence of binary exclusion gates and a quantitative floor, or it cannot use the label at all. This engine implements that logic with a full per-decision audit trail.
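A minimal sketch of the gate-then-threshold structure, under stated assumptions: the holding fields (`weight`, `excluded`, `taxonomy_share`, `tier`) are hypothetical names, excluded names are assumed to stay in the denominator, tiers are assumed to nest (a holding that qualifies for a stricter category also counts toward looser ones), and the PAB-replication deem is left out for brevity. The engine's actual data contract may differ.

```python
FLOOR = 0.70          # category floor
TAXONOMY_DEEM = 0.15  # Taxonomy-aligned share that deems the product Sustainable
CATEGORIES = ("Sustainable", "Transition", "ESG Basics")  # strictest first (assumed)


def classify(holdings):
    """Return (label, audit_trail) for a list of holding dicts."""
    audit = []

    # Gate: excluded names can never qualify, but remain in the denominator.
    eligible = [h for h in holdings if not h["excluded"]]
    audit.append(f"gate: {len(holdings) - len(eligible)} of {len(holdings)} holdings excluded")

    # Deeming path: enough Taxonomy-aligned exposure lifts the book to Sustainable.
    aligned = sum(h["weight"] * h["taxonomy_share"] for h in eligible)
    audit.append(f"taxonomy-aligned share: {aligned:.1%}")
    if aligned >= TAXONOMY_DEEM:
        audit.append("deemed Sustainable via the 15% Taxonomy provision")
        return "Sustainable", audit

    # Threshold: the strictest category whose qualifying weight clears the floor wins.
    for rank, category in enumerate(CATEGORIES):
        qualifying = sum(h["weight"] for h in eligible
                         if h["tier"] in CATEGORIES[: rank + 1])
        audit.append(f"{category}: qualifying weight {qualifying:.1%}")
        if qualifying >= FLOOR:
            return category, audit

    audit.append("no category clears the floor")
    return "Non-classifiable", audit


# Illustrative three-name book, not the bundled sample portfolio.
book = [
    {"weight": 0.45, "excluded": False, "taxonomy_share": 0.10, "tier": "Sustainable"},
    {"weight": 0.30, "excluded": False, "taxonomy_share": 0.05, "tier": "Transition"},
    {"weight": 0.25, "excluded": True, "taxonomy_share": 0.00, "tier": None},
]
label, audit = classify(book)
print(label)
for line in audit:
    print(" ", line)
```

Every branch appends to the audit list before returning, so a label never leaves the function without the decisions that produced it.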

Project structure

sfdr2-classification-engine/
├── sfdr2/
│   ├── config.py       # All regulatory parameters (thresholds, pathway, proxies)
│   ├── data.py         # Phase 1 data contract + synthetic sample portfolio
│   ├── exclusions.py   # Phase 2: PAB exclusion screening (norms + activity)
│   ├── alignment.py    # Phase 3: Taxonomy alignment, 70% test, decarb pathway
│   ├── classify.py     # Phase 4: classification decision-tree orchestrator
│   └── sensitivity.py  # Phase 5: cost-of-compliance & robustness analysis
├── phase7/
│   ├── bl_market_data.py  # Bridge: BL data layer (real SX5E returns + LW cov)
│   ├── esg_data.py        # Phase 7a: real universe + sector-archetype ESG proxies
│   └── esg_optimiser.py   # Phase 7b/c/d: min-TE optimiser, scenarios, frontier
├── run_phase2.py       # Exclusion screening report
├── run_phase3.py       # Alignment & threshold report
├── run_phase4.py       # Full classification + PAB-replication counterfactual
├── run_phase5.py       # Cost-of-compliance & robustness analysis
├── run_phase7.py       # ESG-tilted optimiser report + frontier chart
├── dashboard.py        # Phase 6: interactive Streamlit dashboard (classification)
├── dashboard_phase7.py # Phase 7: interactive ESG-tilted optimiser dashboard
├── build_workbook.js   # Phase 6: generates sfdr2_screening.xlsx (ExcelJS, JS)
├── build_deck.js       # Phase 6: generates sfdr2_methodology.pptx (pptxgenjs, JS)
├── package.json        # JS deps + npm scripts for the Office generators
└── requirements.txt

Delivery layer (Phase 6)

Three artefacts. The two Office files are built natively in JavaScript (no Python) and are standalone — the verified figures are embedded, so they open and present without running the engine:

dashboard.py — live Streamlit app with interactive PAB-replication toggle, year slider, and Taxonomy multiplier. Run: streamlit run dashboard.py
sfdr2_screening.xlsx — formula-driven workbook built with ExcelJS. Cover, Parameters, Issuers (formula breach columns with conditional formatting), Summary (cross-sheet formulas), and Findings tabs. Change a parameter on the Parameters tab and every screen recomputes. Regenerate with npm run workbook (or node build_workbook.js).
sfdr2_methodology.pptx — 9-slide deck built with pptxgenjs, with native (editable) PowerPoint charts, the decision tree, the four findings, and the synthesis. Regenerate with npm run deck (or node build_deck.js).

To rebuild both Office files: npm install && npm run build.

Phase 7 — ESG-tilted optimiser (bridge to Black-Litterman)

Phase 7 closes the decarbonisation residual the SFDR engine diagnoses. It finds the minimum-tracking-error tilt of the SX5E benchmark that (a) drops PAB-excluded names to zero weight, (b) drives weighted-average carbon intensity to the 1.5°C pathway target, under (c) a long-only, full-investment, UCITS-style per-name cap.
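The three constraints and the objective can be checked for any candidate weight vector with a few lines of standard-library Python. This is an illustrative sketch, not the optimiser: the function names and the toy three-name universe are hypothetical, and the covariance matrix is invented for the example rather than estimated from market data.

```python
import math


def tracking_error(weights, benchmark, cov):
    """Ex-ante tracking error sqrt(a' S a) for active weights a = w - b."""
    active = [w - b for w, b in zip(weights, benchmark)]
    n = len(active)
    variance = sum(active[i] * cov[i][j] * active[j]
                   for i in range(n) for j in range(n))
    return math.sqrt(max(variance, 0.0))


def check_constraints(weights, excluded, intensity, target, cap, tol=1e-9):
    """Pass/fail flags for (a) exclusions, (b) carbon target, (c) long-only, budget, cap."""
    waci = sum(w * c for w, c in zip(weights, intensity))  # linear in the weights
    return {
        "exclusions_zeroed": all(w <= tol for w, x in zip(weights, excluded) if x),
        "carbon_on_target": waci <= target + tol,
        "long_only": all(w >= -tol for w in weights),
        "fully_invested": abs(sum(weights) - 1.0) <= 1e-6,
        "within_cap": all(w <= cap + tol for w in weights),
    }


# Toy universe (all numbers invented for illustration).
benchmark = [0.5, 0.3, 0.2]
excluded = [False, False, True]
intensity = [40.0, 100.0, 300.0]
cov = [[0.04, 0.0, 0.0], [0.0, 0.09, 0.0], [0.0, 0.0, 0.16]]

candidate = [0.7, 0.3, 0.0]
flags = check_constraints(candidate, excluded, intensity, target=60.0, cap=0.75)
te = tracking_error(candidate, benchmark, cov)
print(flags, round(te, 4))
```

A checker like this is useful as an independent verification step on whatever weights a solver returns, since it shares no code with the optimisation itself.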

The bridge. phase7/esg_data.py imports the real SX5E universe and the real return/covariance bundle from the Black-Litterman project's market_data.py (vendored here as phase7/bl_market_data.py). The covariance is the same yfinance + Ledoit-Wolf estimate the BL engine uses, so Σ is methodologically identical across the two projects. To point at your live BL project instead, replace phase7/bl_market_data.py or edit the import in esg_data.py — they are the same module.

Data honesty. Returns and market-cap weights are real public market data. ESG attributes are transparent sector-archetype proxies (an energy-major archetype, a clean-utility archetype, etc.), documented placeholders for public-source data (Urgewald GCEL/GOGEL, Norges Bank/GPFG exclusion list, CSRD / Taxonomy Article 8 disclosures). They are not audited ESG figures for the named issuers.

Why minimum tracking error, not max Sharpe. In a real ESG mandate the PM has already committed to the label; the task is to deliver compliance at the lowest cost to the benchmark-relative profile. Exclusions are encoded as hard zeros (a regulatory bright line is binary, not a penalty); the carbon constraint is linear in weights, keeping the program a convex QP solved with SLSQP — the same solver and conventions as the BL optimiser.
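Written out, the program described above takes roughly the following form. The notation is introduced here for illustration and is not taken from the code: $b$ is the benchmark weight vector, $\Sigma$ the covariance matrix, $u$ the per-name cap, $\mathcal{E}$ the set of PAB-excluded names, $c$ the vector of issuer carbon intensities, and $c^{*}$ the pathway target.

```latex
\begin{aligned}
\min_{w}\quad & (w - b)^{\top} \Sigma \,(w - b) \\
\text{s.t.}\quad & \mathbf{1}^{\top} w = 1, \\
& 0 \le w_i \le u \quad \forall i, \\
& w_i = 0 \quad \forall i \in \mathcal{E}, \\
& c^{\top} w \le c^{*}.
\end{aligned}
```

Tracking error is the square root of the objective. Because $\Sigma$ is positive semi-definite and every constraint is linear in $w$, the feasible set is a polytope and the problem stays a convex QP.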

Key finding — the cost of green. On the SX5E universe, reaching the 1.5°C pathway (a ~68% intensity cut vs the parent benchmark) costs roughly 335 bps of annual tracking error under a 12% name cap. The cost-of-green frontier (cost_of_green_frontier.png) is convex: each additional tonne of intensity removed costs more active risk than the last. Because only ~11% of the benchmark is PAB-excluded and qualifying weight stays above 70%, the three compliance paths (pure 70%, 15%-Taxonomy deem, PAB replication) coincide on this universe — itself a finding: for a large-cap European benchmark, the binding constraint is carbon intensity, not the qualifying-weight floor.

Run: python3 run_phase7.py (console report + frontier chart) or streamlit run dashboard_phase7.py (interactive).

Key finding (Phase 5)

On the sample portfolio, SFDR 2.0 classification is driven more by product structure and a single deeming threshold than by portfolio composition:

The PAB exclusion gate destroys 45% of qualifying weight, attributed across five distinct screens (no single screen dominates) — so a portfolio of individually-clean names is still non-classifiable.
Classification flips from Non-classifiable to Sustainable at just 1.40× the current Taxonomy-aligned revenue via the 15% deem — a tipping point well inside the noise band of Taxonomy estimates.
Once the product is structured to replicate a PAB, the label is completely insensitive to the internal qualification proxies (Sustainable/Transition thresholds, even the 70% floor): the deeming provision overrides them all.
Exclusion screening alone leaves the portfolio 14.4% above the 1.5°C pathway — screening is not decarbonisation.

The practitioner implication: the SFDR 2.0 label is a weak signal of a portfolio's actual sustainability; it is largely determined by product-design choices and one estimated threshold. This is the project's research output, not a defect of the engine.
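The tipping-point calculation behind the second finding reduces to a ratio. A minimal sketch, assuming the multiplier scales portfolio-level Taxonomy-aligned share uniformly; the function names are hypothetical and the 12% starting share is an illustrative input, not the sample portfolio's figure.

```python
DEEM_THRESHOLD = 0.15  # Taxonomy-aligned share that triggers the deem


def tipping_multiplier(aligned_share, threshold=DEEM_THRESHOLD):
    """Multiplier on current aligned share at which the deem is triggered."""
    if aligned_share <= 0.0:
        return None  # no finite multiplier can lift a zero share
    return threshold / aligned_share


def deemed_sustainable(aligned_share, multiplier, threshold=DEEM_THRESHOLD):
    """Apply a uniform multiplier (capped at 100%) and test the deem."""
    return min(1.0, aligned_share * multiplier) >= threshold


share = 0.12  # illustrative starting share
flip = tipping_multiplier(share)
print(f"label flips at {flip:.2f}x the current aligned share")
```

Comparing the returned multiplier with the plausible estimation error in Taxonomy alignment figures is what turns this ratio into a robustness statement.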

How to run

Phases 1–5 require Python 3.8+ only, with no third-party dependencies.

python3 run_phase2.py    # PAB exclusion screening
python3 run_phase3.py    # Taxonomy alignment & the 70% test
python3 run_phase4.py    # Full classification engine (2 scenarios)
python3 run_phase5.py    # Cost-of-compliance & robustness analysis
python3 run_phase7.py    # ESG-tilted optimiser + cost-of-green frontier (needs network)

Note: Phases 2–5 use the standard library only. Phase 7 requires the full requirements.txt (yfinance, scikit-learn, scipy, matplotlib) and live network access for the SX5E data pull.

Methodology notes

Thresholds are externalised in config.py so the regulatory parameters are auditable in one place and sensitivity analysis is a config change, not a logic change. Source: Commission Delegated Regulation (EU) 2020/1818 (PAB minimum standards).
Data gaps are conservative. A missing input is never treated as a zero or as alignment; the issuer is flagged for review and never counts as qualifying, but remains an "investment" in the denominator. This reflects SFDR 2.0's estimation-discipline and documentation-on-request obligations.
Synthetic sample data. data.py uses synthetic sector archetypes, not real index constituents with invented ESG figures. A later phase wires public proxy sources (Urgewald Global Coal Exit List / Global Oil & Gas Exit List, Norges Bank / GPFG exclusion list, company CSRD / Taxonomy Article 8 disclosures).
Pathway alignment is a diagnostic, not a gate. The category test is the exclusions + threshold structure; the PAB 1.5°C decarbonisation pathway is a benchmark-construction standard reported as a warning, not a classification gate. Conflating the two is a common error.
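The conservative data-gap rule in the notes above can be sketched as follows. This is an illustration under assumptions: holdings are modelled as hypothetical `(weight, aligned)` pairs where `aligned` is `True`, `False`, or `None` for a missing input, and the numbers are invented.

```python
def qualifying_weight(holdings):
    """Return (qualifying share, flagged-for-review share) of total investments."""
    total = sum(w for w, _ in holdings)
    # Only an explicit True qualifies; None is neither a zero nor a pass.
    qualifying = sum(w for w, aligned in holdings if aligned is True)
    flagged = sum(w for w, aligned in holdings if aligned is None)
    return qualifying / total, flagged / total


book = [(0.5, True), (0.3, None), (0.2, False)]
share, review = qualifying_weight(book)

# The shortcut this rule prevents: dropping the gap from the denominator.
naive_share = 0.5 / (0.5 + 0.2)

print(f"conservative: {share:.1%}, flagged for review: {review:.1%}")
print(f"naive (gap dropped from denominator): {naive_share:.1%}")
```

In this toy book the naive treatment would clear a 70% floor while the conservative one does not, which is why the gap stays in the denominator.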

Roadmap

~~Phase 5: Reclassification & cost-of-compliance sensitivity analysis~~ ✓
~~Phase 6: Streamlit dashboard, parallel Excel screening workbook, pptxgenjs methodology deck~~ ✓
~~Phase 7: ESG-tilted optimiser closing the decarbonisation residual (bridge to the Black-Litterman engine)~~ ✓

Limitations

Important caveats for a real ESG/regulatory context:

It implements a proposal. The 20 November 2025 EC text may change before it becomes law; every threshold is externalised in config.py precisely so a rule change is a config edit, not a logic change.
Category qualification uses documented proxies. Taxonomy-aligned revenue/capex and carbon-intensity proxies stand in for the full substantial-contribution / DNSH / minimum-safeguards test, which is not reconstructable from public data.
Parent-benchmark intensity is proxied by the pre-screen weighted average, not an official index level.
Data gaps bound the result. Missing inputs are treated conservatively (never a silent pass), which in practice parks names in lower tiers — so classification quality is capped by issuer-data coverage.
Screens depend on third-party flags (e.g. Urgewald GCEL/GOGEL, Norges Bank exclusions) whose coverage and update cadence vary.
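The first caveat depends on thresholds living in one place. A minimal sketch of what such an externalised parameter set might look like; the class and field names are hypothetical and do not reproduce the contents of config.py, and the 65% amended floor is an invented scenario.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RegulatoryParams:
    category_floor: float = 0.70    # qualifying-weight floor
    taxonomy_deem: float = 0.15     # Taxonomy-aligned share that triggers the deem
    coal_revenue_max: float = 0.01  # coal revenue share at or above this is excluded
    oil_revenue_max: float = 0.10   # oil revenue share at or above this is excluded


BASELINE = RegulatoryParams()


def clears_floor(qualifying_weight, params=BASELINE):
    """The logic reads the threshold from params and never hard-codes it."""
    return qualifying_weight >= params.category_floor


# A hypothetical amended text becomes a one-field override, not a code change.
amended = replace(BASELINE, category_floor=0.65)
print(clears_floor(0.68), clears_floor(0.68, amended))
```

Freezing the dataclass keeps a scenario run from mutating the baseline by accident; each sensitivity case is a new object derived with `replace`.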

Testing

A pytest suite under tests/ exercises the gate-then-threshold logic as boundary and decision tests rather than numeric-tolerance checks.

Exclusion gates — norms screens fire on a flag; activity thresholds are inclusive at the boundary (oil ≥ 10%, coal ≥ 1%); the power-generation carve-out requires both conditions; a missing input is a conservative data gap, never a silent pass.
70% category floor — binds exactly at the boundary (isolated from the 15% deeming path) and drops the label to the next-strictest tier just below it.
Deeming — ≥ 15% Taxonomy-aligned revenue lifts a sub-floor portfolio to Sustainable via the deem path.
Decision tree — the strictest passing category wins; the decarbonisation pathway is a diagnostic, not a gate (a high-carbon but floor-clearing book stays classifiable, with a warning); every result carries a non-empty audit trail; exclusions that shrink the base below all floors yield "Non-classifiable".

pip install -r requirements-dev.txt
pytest tests/ -q          # 10 tests

Tests run automatically on every push via GitHub Actions (.github/workflows/tests.yml).
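The boundary behaviour described for the exclusion gates can be illustrated with a small screen and a handful of assertions. This is a sketch, not the code under tests/: the field names and the `screen` function are hypothetical, the power-generation thresholds (50% or more of revenue, intensity above 100 gCO2e/kWh) are assumed from the PAB minimum standards, and letting a known breach take precedence over a data gap is a choice made for this example.

```python
COAL_MAX = 0.01              # inclusive: coal revenue >= 1% is excluded
OIL_MAX = 0.10               # inclusive: oil revenue >= 10% is excluded
POWER_REVENUE_MAX = 0.50     # assumed: power revenue >= 50% ...
POWER_INTENSITY_MAX = 100.0  # ... AND intensity > 100 gCO2e/kWh


def screen(issuer):
    """Return (status, details); status is 'pass', 'excluded', or 'data_gap'."""
    breaches, gaps = [], []

    def read(key):
        value = issuer.get(key)
        if value is None:
            gaps.append(key)  # a missing input is recorded, never a silent pass
        return value

    coal, oil = read("coal_revenue"), read("oil_revenue")
    power_rev, power_int = read("power_revenue"), read("power_intensity")

    if coal is not None and coal >= COAL_MAX:
        breaches.append("coal")
    if oil is not None and oil >= OIL_MAX:
        breaches.append("oil")
    # The power screen fires only when both conditions hold.
    if (power_rev is not None and power_int is not None
            and power_rev >= POWER_REVENUE_MAX and power_int > POWER_INTENSITY_MAX):
        breaches.append("power")

    if breaches:
        return "excluded", breaches
    if gaps:
        return "data_gap", gaps
    return "pass", []


clean = {"coal_revenue": 0.009, "oil_revenue": 0.099,
         "power_revenue": 0.60, "power_intensity": 90.0}

assert screen(clean) == ("pass", [])
assert screen({**clean, "coal_revenue": 0.01}) == ("excluded", ["coal"])
assert screen({**clean, "oil_revenue": 0.10}) == ("excluded", ["oil"])
assert screen({**clean, "power_intensity": 150.0}) == ("excluded", ["power"])
assert screen({**clean, "coal_revenue": None})[0] == "data_gap"
```

Testing at the exact threshold value, rather than comfortably above it, is what catches an accidental `>` in place of `>=`.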
