Eora26 unit metadata is incorrect: data is in '000 USD, not Mill USD #175

@graebnerc

Summary

pymrio.parse_eora26() sets unit to "Mill USD" based on personal communication, but empirical verification against known GDP benchmarks shows the data is actually in thousands of USD ('000 USD), consistent with the Eora FAQ.

Reproduction

Tested with Eora26 basic price tables for 2017.

import pymrio

eora = pymrio.parse_eora26(year=2017, path="Eora26_2017_bp.zip")
eora.calc_all()

# Check what unit pymrio assigns
print(f"pymrio unit metadata: {eora.unit.iloc[0, 0]}")
# prints: "Mill USD"

# US total output
us_x = eora.x.loc[("USA", slice(None)), :].sum().sum()
print(f"\nRaw US total output (x): {us_x:,.0f}")

# US total final demand (column sums for USA)
us_y_cols = [c for c in eora.Y.columns if c[0] == "USA"]
us_final_demand = eora.Y[us_y_cols].sum().sum()
print(f"Raw US total final demand (Y): {us_final_demand:,.0f}")

# US value added = total output - intermediate inputs
us_z_cols = [c for c in eora.Z.columns if c[0] == "USA"]
us_intermediate = eora.Z[us_z_cols].sum().sum()
us_va = us_x - us_intermediate
print(f"Raw US value added (x - Z col sums): {us_va:,.0f}")
print(f"  Interpreted as '000 USD: ${us_va / 1e9:.1f} trillion")
print(f"  Interpreted as Mill USD: ${us_va / 1e6:.1f} trillion")

# World value added
world_x = eora.x.sum().sum()
world_intermediate = eora.Z.sum().sum()
world_va = world_x - world_intermediate
print(f"\nRaw world value added: {world_va:,.0f}")
print(f"  Interpreted as '000 USD: ${world_va / 1e9:.1f} trillion")
print(f"  Interpreted as Mill USD: ${world_va / 1e6:.1f} trillion")

Output

pymrio unit metadata: Mill USD

Raw US total output (x): 33,068,009,892
Raw US total final demand (Y): 19,768,436,862
Raw US value added (x - Z col sums): 19,061,267,619
  Interpreted as '000 USD: $19.1 trillion
  Interpreted as Mill USD: $19061.3 trillion

Raw world value added: 75,801,582,002
  Interpreted as '000 USD: $75.8 trillion
  Interpreted as Mill USD: $75801.6 trillion

Interpretation against benchmark

US GDP in 2017 was approximately $19.5 trillion (World Bank). Total value added should be slightly below GDP, since GDP at market prices also includes net taxes on products.

Interpretation         | US VA            | World VA
'000 USD (thousands)   | $19.1 trillion   | $75.8 trillion
Mill USD (millions)    | $19,061 trillion | $75,802 trillion

The thousands interpretation matches the expected magnitude. The millions interpretation is off by a factor of 1,000.
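
The comparison can also be done mechanically. The following is a minimal sketch, not part of pymrio: it takes the raw US value added printed above and the approximate World Bank GDP figure, and picks the candidate unit whose implied value lands closest to the benchmark on a log scale. The helper names (`to_trillion_usd`, `log_distance`) and the candidate list are assumptions made for illustration.

```python
import math

# Raw US value added copied from the output above.
RAW_US_VA = 19_061_267_619

# Benchmark cited above: US GDP 2017, approximately 19.5 trillion USD.
US_GDP_2017_USD = 19.5e12

# Candidate unit labels and the number of USD one unit represents.
UNIT_SCALES = {"'000 USD": 1e3, "Mill USD": 1e6}


def to_trillion_usd(raw, scale):
    """Convert a raw table value to trillions of USD for a given unit scale."""
    return raw * scale / 1e12


def log_distance(raw, scale, benchmark_usd):
    """Orders of magnitude between the implied value and the benchmark."""
    return abs(math.log10(raw * scale / benchmark_usd))


for label, scale in UNIT_SCALES.items():
    implied = to_trillion_usd(RAW_US_VA, scale)
    gap = log_distance(RAW_US_VA, scale, US_GDP_2017_USD)
    print(f"{label}: ${implied:,.1f} trillion, {gap:.2f} orders of magnitude from benchmark")

# The most plausible unit is the one closest to the benchmark.
best = min(UNIT_SCALES, key=lambda u: log_distance(RAW_US_VA, UNIT_SCALES[u], US_GDP_2017_USD))
print(f"Most plausible unit: {best}")
```

Comparing on a log scale keeps the check robust to the small expected gap between value added and GDP while still separating units that differ by a factor of 1,000.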

Where the incorrect unit is set

In pymrio/tools/ioparser.py, the parse_eora26 function contains:

# bandwidth is set to 1 for eora
# based on personal communication, unit is set to Mill USD
io.unit = pd.DataFrame(
    data=["Mill USD"] * len(io.get_sectors()),
    index=io.get_sectors(),
    columns=["unit"],
)

This should be corrected to "000 USD" (thousands of USD) to match the actual data and the Eora FAQ, which states:

All monetary data are in '000 USD (thousands of US Dollars).
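
Until the parser is changed, the label could be overridden after parsing. This is only a sketch of a possible workaround: it rebuilds a stand-in for the unit table in the same shape as the snippet above, using hypothetical sector names in place of io.get_sectors(), and relabels it. The helper `relabel_unit` and the exact label string "'000 USD" are assumptions, not pymrio API.

```python
import pandas as pd

# Hypothetical sector names standing in for io.get_sectors().
sectors = ["Agriculture", "Fishing", "Mining and Quarrying"]

# Stand-in for the unit table that parse_eora26 builds (same shape as the snippet above).
unit = pd.DataFrame(
    data=["Mill USD"] * len(sectors),
    index=sectors,
    columns=["unit"],
)


def relabel_unit(unit_df, new_label):
    """Return a copy of the unit table with every row set to new_label."""
    fixed = unit_df.copy()
    fixed["unit"] = new_label
    return fixed


# Assumed label string; the data values themselves are left untouched.
fixed_unit = relabel_unit(unit, "'000 USD")
print(fixed_unit)
```

On a parsed system this would presumably be applied as `eora.unit = relabel_unit(eora.unit, "'000 USD")`, assuming `eora.unit` is a plain DataFrame as the parser snippet suggests. Only the metadata changes, so any results already scaled under the "Mill USD" assumption would still need to be divided by 1,000.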

Environment

  • pymrio version: 0.5.4
  • Python 3.12
  • Eora26 2017 basic price tables (Eora26_2017_bp.zip)
