Modulation-Aware Key Estimator

Region-wise key estimation for songs that change key

Most key-estimation demos force one global label onto a whole track. This repo keeps the section boundary visible: it estimates likely key regions, reports candidate modulation points, and can pitch-shift each region toward a target key.

{
  "target_key_name": "C",
  "modulation_points": [{"time_sec": 74.24}],
  "region_infos": [
    {"start_time_sec": 0.0, "end_time_sec": 74.24, "key_name": "G", "confidence": 0.82},
    {"start_time_sec": 74.24, "end_time_sec": 181.76, "key_name": "A", "confidence": 0.77}
  ]
}
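As a minimal sketch of how the output above might be consumed, the snippet below reads the sample JSON and computes a semitone shift per region toward the target key. The field names come from the sample. The helper `semitone_shift` and the choice of the smallest-magnitude shift (folded into the range -6 to +5) are assumptions for illustration, not necessarily the policy the package applies.

```python
import json

# Pitch classes in the order listed in this README (sharps only).
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def semitone_shift(source_key: str, target_key: str) -> int:
    """Hypothetical helper: smallest signed shift from source to target pitch class."""
    src = PITCH_CLASSES.index(source_key.strip().upper())
    dst = PITCH_CLASSES.index(target_key.strip().upper())
    # Fold the upward distance (0..11) into the range -6..+5.
    return (dst - src + 6) % 12 - 6


result = json.loads("""
{
  "target_key_name": "C",
  "modulation_points": [{"time_sec": 74.24}],
  "region_infos": [
    {"start_time_sec": 0.0, "end_time_sec": 74.24, "key_name": "G", "confidence": 0.82},
    {"start_time_sec": 74.24, "end_time_sec": 181.76, "key_name": "A", "confidence": 0.77}
  ]
}
""")

# One (start, end, shift) tuple per estimated key region.
shifts = [
    (
        region["start_time_sec"],
        region["end_time_sec"],
        semitone_shift(region["key_name"], result["target_key_name"]),
    )
    for region in result["region_infos"]
]

for start, end, shift in shifts:
    print(f"{start:.2f}-{end:.2f}s: shift by {shift:+d} semitones")
```

Folding toward the nearest direction keeps each shift at six semitones or fewer, which tends to limit pitch-shifting artifacts; a different policy (always up, always down) would only change the last line of the helper.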

What It Does

extracts chroma and HPCP-style harmonic pitch-class features
runs a two-stream Transformer checkpoint
predicts a probability for each of the 12 pitch-class keys per audio window
groups windows into likely key regions
exposes approximate modulation points
serves local CLI and FastAPI inference
downloads the release checkpoint with SHA-256 verification

Install

git clone https://github.com/SihyeonJeon/Modulation-aware-key-estimator.git
cd Modulation-aware-key-estimator
python -m venv .venv
source .venv/bin/activate
pip install -e .

The checkpoint downloads on first use from the GitHub release and is cached under ~/.cache/modulation-aware-key-estimator/.
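For readers who want to check a downloaded checkpoint by hand, the following sketch shows what SHA-256 verification of a file generally looks like in Python. The function names `sha256_of_file` and `verify_checkpoint` are hypothetical and are not taken from this package; the expected digest would have to come from wherever the release publishes it.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large checkpoints do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checkpoint(path: Path, expected_sha256: str) -> bool:
    """Hypothetical helper: compare the file digest with a published hex digest."""
    return sha256_of_file(path) == expected_sha256.strip().lower()
```

A caller would typically delete the cached file and download it again when `verify_checkpoint` returns `False`.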

To use a local checkpoint instead, set MODEL_CHECKPOINT_PATH:

MODEL_CHECKPOINT_PATH=/path/to/key_model.pt mod-key-estimator --wav song.wav --json

CLI

Local file:

mod-key-estimator --wav song.wav --target-key c --json

YouTube URL through yt-dlp:

mod-key-estimator --youtube-url "https://www.youtube.com/watch?v=..." --target-key f#

If a video requires browser cookies, pass them explicitly:

mod-key-estimator --youtube-url "https://www.youtube.com/watch?v=..." --cookies ./cookies.txt

No cookies file is stored in this repository.

API

uvicorn modulation_key_estimator.api:app --host 0.0.0.0 --port 8000

curl -X POST http://localhost:8000/analyze-file \
  -F "file=@song.wav" \
  -F "target_key=c"

curl -X POST http://localhost:8000/analyze-youtube \
  -H "content-type: application/json" \
  -d '{"youtube_url":"https://www.youtube.com/watch?v=...","target_key":"c"}'
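The second curl call can also be made from Python. The sketch below uses only the standard library and mirrors the request shown above; the function names are hypothetical, and the assumption that the endpoint answers with a JSON body is mine rather than something this README states.

```python
import json
import urllib.request


def build_analyze_youtube_request(
    base_url: str, youtube_url: str, target_key: str
) -> urllib.request.Request:
    """Build the same POST request as the curl example for /analyze-youtube."""
    payload = json.dumps(
        {"youtube_url": youtube_url, "target_key": target_key}
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/analyze-youtube",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze_youtube(
    base_url: str, youtube_url: str, target_key: str, timeout: float = 600.0
) -> dict:
    """Send the request and decode the response, assuming it is JSON."""
    request = build_analyze_youtube_request(base_url, youtube_url, target_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server from the uvicorn command running, `analyze_youtube("http://localhost:8000", "https://www.youtube.com/watch?v=...", "c")` would send the request. The generous default timeout is a guess, since download and inference time depend on the track.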

Docker

docker build -t modulation-key-estimator .
docker run --rm -p 8000:8000 modulation-key-estimator

Model Surface

Item	Value
input	mono audio, resampled to 16 kHz
features	chroma + HPCP-style 12-bin harmonic features
architecture	two-stream Transformer encoder with attention pooling
output	12 pitch-class probabilities per window
regioning	probability-shift grouping across neighboring windows
checkpoint	GitHub Release asset with SHA-256 verification

See docs/model-card.md for intended use, limitations, and failure modes.
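To make the regioning row more concrete, here is a deliberately simplified sketch of turning per-window probabilities into key regions. It is not the package's probability-shift grouping: it takes the most probable pitch class per window, merges consecutive windows with the same label, and absorbs runs shorter than `min_windows` into the preceding region. The function name, the `min_windows` parameter, and the 2-second window length in the demo are all assumptions.

```python
from typing import Dict, List, Sequence

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def group_windows(
    probs: Sequence[Sequence[float]], window_sec: float, min_windows: int = 2
) -> List[Dict]:
    """Simplified regioning: argmax per window, then merge runs of equal labels."""
    labels = [max(range(len(row)), key=row.__getitem__) for row in probs]

    # Run-length encode the labels as [label, start_index, end_index_exclusive].
    runs: List[List[int]] = []
    for i, label in enumerate(labels):
        if runs and runs[-1][0] == label:
            runs[-1][2] = i + 1
        else:
            runs.append([label, i, i + 1])

    # Absorb short runs (likely noise) into the preceding region, and rejoin
    # regions that end up adjacent with the same label.
    merged: List[List[int]] = []
    for label, start, end in runs:
        if merged and (end - start < min_windows or merged[-1][0] == label):
            merged[-1][2] = end
        else:
            merged.append([label, start, end])

    regions = []
    for label, start, end in merged:
        # Confidence here is the mean probability of the region's key.
        confidence = sum(probs[i][label] for i in range(start, end)) / (end - start)
        regions.append(
            {
                "start_time_sec": start * window_sec,
                "end_time_sec": end * window_sec,
                "key_name": PITCH_CLASSES[label],
                "confidence": confidence,
            }
        )
    return regions


def peaked(index: int, peak: float = 0.8) -> List[float]:
    """Demo helper: a 12-bin distribution with most mass on one pitch class."""
    rest = (1.0 - peak) / 11
    return [peak if k == index else rest for k in range(12)]


# Synthetic windows: G G G D G G A A A (the single D window is treated as noise).
demo = [peaked(7)] * 3 + [peaked(2)] + [peaked(7)] * 2 + [peaked(9)] * 3
regions = group_windows(demo, window_sec=2.0)
modulation_points = [r["start_time_sec"] for r in regions[1:]]
```

The absorbed D window lowers the mean confidence of the G region, which is one simple way for a noisy stretch to show up in the reported score.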

Evaluation

Run a labeled manifest:

python scripts/evaluate_manifest.py examples/manifest.example.csv --json

Expected CSV columns:

path,expected_key
path/to/song.wav,c

The script reports exact pitch-class accuracy and per-file predictions. Replace the example manifest with local labeled audio before reporting a benchmark number.
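Exact pitch-class accuracy is the fraction of files whose predicted pitch class equals the labeled one. The sketch below illustrates that metric on an in-memory manifest with the two columns listed above. It is not the code in scripts/evaluate_manifest.py; the file names and predictions are made up for the example.

```python
import csv
import io
from typing import Sequence

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pitch_class(key_name: str) -> int:
    """Map a key name such as 'c' or 'F#' to its pitch-class index (0..11)."""
    return PITCH_CLASSES.index(key_name.strip().upper())


def exact_pitch_class_accuracy(
    expected: Sequence[str], predicted: Sequence[str]
) -> float:
    """Fraction of items whose predicted pitch class matches the label exactly."""
    if len(expected) != len(predicted):
        raise ValueError("expected and predicted must have the same length")
    if not expected:
        raise ValueError("cannot compute accuracy on an empty manifest")
    hits = sum(pitch_class(e) == pitch_class(p) for e, p in zip(expected, predicted))
    return hits / len(expected)


# Hypothetical manifest and predictions, for illustration only.
manifest = io.StringIO("path,expected_key\na.wav,c\nb.wav,f#\nc.wav,a\n")
rows = list(csv.DictReader(manifest))
expected = [row["expected_key"] for row in rows]
predicted = ["C", "F#", "G"]

accuracy = exact_pitch_class_accuracy(expected, predicted)
print(f"exact pitch-class accuracy: {accuracy:.3f}")
```

Because the metric is exact, a prediction a fifth away from the label counts the same as any other miss.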

Boundary

This repo currently ships the inference package, model architecture, release checkpoint, and manifest-based evaluation script. It does not yet ship the original training code, training manifest, dataset list, or training logs.

The checkpoint predicts pitch class only: C, C#, ..., B. It does not model major/minor, modal function, enharmonic spelling, or score-level harmonic analysis.
