All SC2 structures, units, abilities, researches and the dependencies between them, in machine-readable JSON format.
This repository contains Python scripts for data generation.
- Install StarCraft II (on Linux, also set the environment variables), similar to the instructions here
- Install Python 3.9 or newer
- Install uv via pip install uv
The Python code to generate new data is in the directory generate.
You can run
uv run --env-file=.env python run.py
to generate a new /data/data.json.
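To sanity-check a freshly generated file, a small script along these lines can summarize what it contains. This is only a sketch: the path data/data.json (relative to the repository root) and the assumption that the JSON root is an object are guesses, and the function name summarize is made up for illustration. It deliberately avoids assuming any particular key names.

```python
import json
from pathlib import Path


def summarize(path):
    """Return {top-level key: entry count} for a JSON file whose root is an object."""
    with Path(path).open(encoding="utf-8") as f:
        data = json.load(f)
    if not isinstance(data, dict):
        raise ValueError(f"expected a JSON object at the root of {path}")
    summary = {}
    for key, value in data.items():
        # Containers report their length; scalars count as a single entry.
        summary[key] = len(value) if isinstance(value, (list, dict)) else 1
    return summary


if __name__ == "__main__":
    # Assumed location; adjust to wherever run.py wrote the file.
    target = Path("data/data.json")
    if target.exists():
        for key, count in summarize(target).items():
            print(f"{key}: {count}")
    else:
        print(f"{target} not found; run the generator first")
```

If the counts drop sharply between two runs, that is usually a hint to look at the generator output before committing the new file.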
From the SC2 data, .xml files can be extracted. Use the Dockerfile for this step:
docker build -t stormex-image ./src
docker run -v "path/to/starcraft/StarCraft II:/data/sc2data:ro" -v ./src/xml:/data/output stormex-image /data/sc2data -s .xml -x -o /data/output
# Creates src/extracted/stableid.json
docker run -v "path/to/starcraft/StarCraft II:/data/sc2data:ro" -v ./src/extracted:/data/output/mods/core.sc2mod/base.sc2data/GameData stormex-image /data/sc2data -s stableid.json -x -o /data/output
Convert the data from .xml to .json with
# Creates .../*Data.json
uv run src/xml_to_json.py
Then merge the relevant .json files in this order:
liberty.sc2mod -> libertymulti.sc2mod -> swarm.sc2mod -> swarmmulti.sc2mod -> void.sc2mod -> voidmulti.sc2mod
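The idea of merging in load order can be sketched as a recursive dictionary merge. This is an illustration only, not the logic of src/merge_json.py: it assumes that a later mod overrides an earlier one key by key, and the names deep_merge, merge_in_load_order and the sample values are hypothetical.

```python
from copy import deepcopy

# Load order as listed above.
# Assumption: a later mod overrides values from an earlier one.
LOAD_ORDER = [
    "liberty.sc2mod",
    "libertymulti.sc2mod",
    "swarm.sc2mod",
    "swarmmulti.sc2mod",
    "void.sc2mod",
    "voidmulti.sc2mod",
]


def deep_merge(base, override):
    """Return a new dict where values in `override` win over `base`.

    Nested dicts are merged recursively; any other value is replaced wholesale.
    """
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def merge_in_load_order(data_by_mod):
    """Fold per-mod dicts together in LOAD_ORDER, skipping mods that are absent."""
    result = {}
    for mod in LOAD_ORDER:
        result = deep_merge(result, data_by_mod.get(mod, {}))
    return result


if __name__ == "__main__":
    # Hypothetical entries, not real game values.
    example = {
        "liberty.sc2mod": {"Marine": {"LifeMax": 45, "Speed": 2.25}},
        "voidmulti.sc2mod": {"Marine": {"Speed": 3.15}},
    }
    print(merge_in_load_order(example))
```

Folding left to right keeps untouched fields from earlier mods while letting later mods change individual values, which is why the order of the list matters.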
Run
# Creates src/merged/*Data.json
uv run src/merge_json.py
From here we can generate the techtree (all units, all abilities)
# Creates src/json/techtree.json
uv run src/generate_techtree.py
A smaller version of the techtree (with only real units and hardcoded suppressions) can be generated with
# Creates src/computed/data.json
uv run src/reconstruct_data.py
All in one:
uv run src/xml_to_json.py && uv run src/merge_json.py && uv run src/generate_techtree.py && uv run src/reconstruct_data.py
Resulting files should be:
src/
├── Dockerfile
├── merge_xml.py
├── convert_xml_to_json.py
├── xml/ # Sourced from SC2
│   ├── campaigns/
│   └── mods/ # Load order
│       ├── liberty.sc2mod/
│       ├── libertymulti.sc2mod/
│       ├── balancemulti.sc2mod/
│       └── voidmulti.sc2mod/
├── merged/ # Merged XML
└── json/ # Final JSON output
Please open a new issue on GitHub.
Pull requests to fix things or for extensions are welcome as well, although I suggest asking me first by opening an issue or otherwise. The data model changes are usually quite hard to get right, and the data collection script itself is quite complicated and full of edge cases.