Eora26 unit metadata is incorrect: data is in '000 USD, not Mill USD
Summary
pymrio.parse_eora26() sets unit to "Mill USD" based on personal communication, but empirical verification against known GDP benchmarks shows the data is actually in thousands of USD ('000 USD), consistent with the Eora FAQ.
Reproduction
Tested with Eora26 basic price tables for 2017.
import pymrio
eora = pymrio.parse_eora26(year=2017, path="Eora26_2017_bp.zip")
eora.calc_all()
# Check what unit pymrio assigns
print(f"pymrio unit metadata: {eora.unit.iloc[0, 0]}")
# prints: "Mill USD"
# US total output
us_x = eora.x.loc[("USA", slice(None)), :].sum().sum()
print(f"\nRaw US total output (x): {us_x:,.0f}")
# US total final demand (column sums for USA)
us_y_cols = [c for c in eora.Y.columns if c[0] == "USA"]
us_final_demand = eora.Y[us_y_cols].sum().sum()
print(f"Raw US total final demand (Y): {us_final_demand:,.0f}")
# US value added = total output - intermediate inputs
us_z_cols = [c for c in eora.Z.columns if c[0] == "USA"]
us_intermediate = eora.Z[us_z_cols].sum().sum()
us_va = us_x - us_intermediate
print(f"Raw US value added (x - Z col sums): {us_va:,.0f}")
print(f" Interpreted as '000 USD: ${us_va / 1e9:.1f} trillion")
print(f" Interpreted as Mill USD: ${us_va / 1e6:.1f} trillion")
# World value added
world_x = eora.x.sum().sum()
world_intermediate = eora.Z.sum().sum()
world_va = world_x - world_intermediate
print(f"\nRaw world value added: {world_va:,.0f}")
print(f" Interpreted as '000 USD: ${world_va / 1e9:.1f} trillion")
print(f" Interpreted as Mill USD: ${world_va / 1e6:.1f} trillion")
Output
pymrio unit metadata: Mill USD
Raw US total output (x): 33,068,009,892
Raw US total final demand (Y): 19,768,436,862
Raw US value added (x - Z col sums): 19,061,267,619
Interpreted as '000 USD: $19.1 trillion
Interpreted as Mill USD: $19061.3 trillion
Raw world value added: 75,801,582,002
Interpreted as '000 USD: $75.8 trillion
Interpreted as Mill USD: $75801.6 trillion
Interpretation against benchmark
US GDP in 2017 was approximately $19.5 trillion (World Bank). Total value added should be slightly below GDP, because GDP also includes taxes less subsidies on products.
| Interpretation | US VA | World VA |
|---|---|---|
| '000 USD (thousands) | $19.1 trillion | $75.8 trillion |
| Mill USD (millions) | $19,061 trillion | $75,802 trillion |
The thousands interpretation matches the expected magnitude. The millions interpretation is off by a factor of 1,000.
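The comparison above is a small piece of arithmetic. The sketch below reproduces it, assuming the raw US value added printed in the Output section and the approximate $19.5 trillion GDP benchmark. The helper names `to_trillion_usd` and `closest_unit` are illustrative only and are not part of pymrio.

```python
import math

# Candidate unit labels and their multipliers to plain USD.
UNIT_MULTIPLIERS = {"'000 USD": 1e3, "Mill USD": 1e6}


def to_trillion_usd(raw_value, unit):
    """Convert a raw table value to trillions of USD under the given unit."""
    return raw_value * UNIT_MULTIPLIERS[unit] / 1e12


def closest_unit(raw_value, benchmark_usd):
    """Return the candidate unit whose implied USD value is closest
    to the benchmark, measured on a log10 scale."""
    return min(
        UNIT_MULTIPLIERS,
        key=lambda unit: abs(
            math.log10(raw_value * UNIT_MULTIPLIERS[unit] / benchmark_usd)
        ),
    )


us_va_raw = 19_061_267_619   # raw US value added from the output above
us_gdp_2017_usd = 19.5e12    # approximate benchmark quoted above

for unit in UNIT_MULTIPLIERS:
    print(f"{unit}: ${to_trillion_usd(us_va_raw, unit):,.1f} trillion")
print("Best match:", closest_unit(us_va_raw, us_gdp_2017_usd))
```

Comparing on a log scale makes the check insensitive to the small expected gap between value added and GDP while still separating candidates that differ by a factor of 1,000.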
Where the incorrect unit is set
In pymrio/tools/ioparser.py, the parse_eora26 function contains:
# bandwidth is set to 1 for eora
# based on personal communication, unit is set to Mill USD
io.unit = pd.DataFrame(
    data=["Mill USD"] * len(io.get_sectors()),
    index=io.get_sectors(),
    columns=["unit"],
)
This should be corrected to "000 USD" (thousands of USD) to match the actual data and the Eora FAQ, which states:
All monetary data are in '000 USD (thousands of US Dollars).
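Until the parser is changed, the label can be overridden after parsing. The following is a minimal sketch, assuming the unit table is an ordinary pandas DataFrame with a single `unit` column as in the snippet above. `relabel_unit` is a hypothetical helper, not a pymrio function. The small frame stands in for a real parsed system, its sector names are placeholders, and the replacement label string is one possible choice.

```python
import pandas as pd


def relabel_unit(unit_df, wrong="Mill USD", right="'000 USD"):
    """Return a copy of a unit table with the wrong label replaced.

    Only the label changes; no numeric data is rescaled.
    """
    return unit_df.replace(wrong, right)


# Stand-in for the unit table of a parsed Eora26 system (placeholder sectors).
unit = pd.DataFrame(
    data=["Mill USD"] * 2,
    index=["Agriculture", "Fishing"],
    columns=["unit"],
)

fixed = relabel_unit(unit)
print(fixed["unit"].unique())
```

With a parsed system, the equivalent step would presumably be `eora.unit = relabel_unit(eora.unit)`, leaving the monetary values themselves untouched.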
Environment
- pymrio version: 0.5.4
- Python 3.12
- Eora26 2017 basic price tables (Eora26_2017_bp.zip)