Open-Set Speaker Identification Evaluation Set Generator

This repository provides tools to generate diverse evaluation sets for open-set speaker identification experiments using the VoxBlink2 dataset. From a pre-selected pool of 500 speakers, you can create multiple enrollment and evaluation sets by randomly sampling speakers.

Overview

In open-set speaker identification, the system must decide which enrolled speaker produced a test utterance, and must also reject utterances from unknown (non-enrolled) speakers. This tool supports reproducible experiments by generating multiple seeded trials, each built from a different randomly sampled subset of the speaker pool.
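
As a rough illustration of the task itself (not part of this repository), the sketch below scores a test embedding against each enrolled speaker and rejects it as "unknown" when the best score falls below a threshold. The cosine scoring, the `identify` function, the toy two-dimensional embeddings, and the threshold value are all assumptions made for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify(test_emb, enrolled, threshold=0.5):
    """Return the best-matching enrolled speaker ID, or "unknown".

    `enrolled` maps speaker ID -> embedding (hypothetical representation).
    """
    best_id, best_score = "unknown", float("-inf")
    for spk_id, emb in enrolled.items():
        score = cosine(test_emb, emb)
        if score > best_score:
            best_id, best_score = spk_id, score
    # Open-set step: reject when even the best match is too weak.
    return best_id if best_score >= threshold else "unknown"


enrolled = {"spk_a": [1.0, 0.0], "spk_b": [0.0, 1.0]}
print(identify([0.9, 0.1], enrolled))    # close to spk_a
print(identify([-1.0, -1.0], enrolled))  # far from every enrolled speaker
```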

Files

| File | Description |
| --- | --- |
| `make_evalset.py` | Script to generate enrollment and evaluation JSON files |
| `spk_enroll_info.json` | Pre-selected enrollment utterances for 500 speakers |
| `spk_eval_info.json` | Pre-selected evaluation utterances for 500 speakers |

Requirements

Python 3.7+
No additional dependencies required (uses only standard library)

Usage

python make_evalset.py --num_speakers <N> [--num_trials <T>] [--seed <S>] [--output_dir <DIR>]

Arguments

| Argument | Required | Default | Description |
| --- | --- | --- | --- |
| `--num_speakers` | Yes | - | Number of enrolled speakers (1-300) |
| `--num_trials` | No | 100 | Number of evaluation sets to generate |
| `--seed` | No | 42 | Random seed for reproducibility |
| `--output_dir` | No | `evalsets` | Output directory |
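
To show why a fixed seed makes the generated trials repeatable, here is a minimal sketch of seeded speaker sampling. It is not the code in `make_evalset.py`, whose internals are not described here; the `sample_trials` function and the synthetic speaker IDs are hypothetical.

```python
import random


def sample_trials(pool, num_speakers, num_trials=100, seed=42):
    """Draw `num_trials` enrolled-speaker subsets from `pool`.

    A dedicated Random instance keeps the draws independent of any
    other use of the global random state.
    """
    rng = random.Random(seed)
    trials = []
    for _ in range(num_trials):
        # Sample without replacement; the rest of the pool would act
        # as unknown speakers for this trial.
        trials.append(sorted(rng.sample(pool, num_speakers)))
    return trials


pool = [f"spk{i:03d}" for i in range(500)]  # stand-in for the 500-speaker pool
first = sample_trials(pool, 300, num_trials=3)
second = sample_trials(pool, 300, num_trials=3)
print(first == second)  # same seed, same subsets
```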

Example

# Generate 100 evaluation sets with 300 enrolled speakers
python make_evalset.py --num_speakers 300

# Generate 50 evaluation sets with 100 enrolled speakers
python make_evalset.py --num_speakers 100 --num_trials 50 --seed 123

Output Format

The script creates the following directory structure:

evalsets/
└── num_spk_<N>/
    ├── enroll_000.json
    ├── eval_000.json
    ├── enroll_001.json
    ├── eval_001.json
    ├── ...
    ├── enroll_<T-1>.json
    └── eval_<T-1>.json

<N>: Number of enrolled speakers (--num_speakers)
<T>: Number of trials (--num_trials, default: 100)
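
The naming pattern above can be enumerated programmatically, for example to loop over every trial in an experiment. This is a small sketch; the `expected_files` helper is hypothetical and assumes the zero-padded three-digit index shown in the tree.

```python
from pathlib import Path


def expected_files(num_speakers, num_trials=100, output_dir="evalsets"):
    """Return (enroll_path, eval_path) pairs following the layout above."""
    root = Path(output_dir) / f"num_spk_{num_speakers}"
    return [
        (root / f"enroll_{i:03d}.json", root / f"eval_{i:03d}.json")
        for i in range(num_trials)
    ]


pairs = expected_files(300, num_trials=2)
for enroll_path, eval_path in pairs:
    print(enroll_path.as_posix(), eval_path.as_posix())
```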

Enrollment JSON (`enroll_XXX.json`)

{
    "speaker_id_1": ["path/to/utt1.wav", "path/to/utt2.wav", ...],
    "speaker_id_2": ["path/to/utt1.wav", ...],
    ...
}

Evaluation JSON (`eval_XXX.json`)

[
    {
        "audio_path": "path/to/utterance.wav",
        "label": "speaker_id" or "unknown",
        "speaker_id": "actual_speaker_id"
    },
    ...
]

label: Speaker ID if enrolled, "unknown" if not enrolled
speaker_id: Ground truth speaker ID (for analysis purposes)
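
The relationship between the two files can be checked with a few lines of standard-library Python. The sketch below assumes the formats described above and uses small inline JSON strings in place of real files; `check_trial` and the sample IDs and paths are hypothetical.

```python
import json


def check_trial(enroll, eval_items):
    """Verify each label against the enrolled set; return (known, unknown) counts."""
    enrolled = set(enroll)
    n_known = n_unknown = 0
    for item in eval_items:
        spk = item["speaker_id"]
        expected = spk if spk in enrolled else "unknown"
        if item["label"] != expected:
            raise ValueError(f"label mismatch for {item['audio_path']}")
        if expected == "unknown":
            n_unknown += 1
        else:
            n_known += 1
    return n_known, n_unknown


# Inline stand-ins for enroll_000.json and eval_000.json.
enroll = json.loads('{"id001": ["id001/a.wav", "id001/b.wav"]}')
eval_items = json.loads("""[
    {"audio_path": "id001/c.wav", "label": "id001", "speaker_id": "id001"},
    {"audio_path": "id002/a.wav", "label": "unknown", "speaker_id": "id002"}
]""")
print(check_trial(enroll, eval_items))
```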

Audio Data

This evaluation protocol is designed for the VoxBlink2 dataset. You need to download the VoxBlink2 dataset separately and ensure the audio paths in the JSON files are correctly mapped to your local setup.
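
One possible way to map the paths is to prepend your local dataset root to each entry. The sketch below assumes the JSON paths are relative to the dataset root, which you should verify against the actual files; `remap_paths` and the `/data/VoxBlink2` location are hypothetical.

```python
from pathlib import Path


def remap_paths(eval_items, audio_root):
    """Return copies of the evaluation entries with `audio_root` prepended.

    Assumes each "audio_path" is relative; an absolute path would
    replace the root instead of being joined to it.
    """
    root = Path(audio_root)
    return [
        {**item, "audio_path": (root / item["audio_path"]).as_posix()}
        for item in eval_items
    ]


items = [{"audio_path": "id001/utt1.wav", "label": "unknown", "speaker_id": "id001"}]
remapped = remap_paths(items, "/data/VoxBlink2")
print(remapped[0]["audio_path"])
```

The same approach applies to the enrollment JSON by remapping each path in every speaker's list.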

License

This project is licensed under the MIT License.

Dataset License

The VoxBlink2 dataset is licensed under CC BY-NC-SA 4.0. Please ensure compliance with the dataset license when using it for your research.

MIT License

Copyright (c) 2026-present NAVER Cloud Corp.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

References

VoxBlink2 Dataset

If you use this evaluation protocol, please cite the VoxBlink2 dataset:

@misc{lin2024voxblink2100kspeakerrecognition,
      title={VoxBlink2: A 100K+ Speaker Recognition Corpus and the Open-Set Speaker-Identification Benchmark}, 
      author={Yuke Lin and Ming Cheng and Fulin Zhang and Yingying Gao and Shilei Zhang and Ming Li},
      year={2024},
      eprint={2407.11510},
      archivePrefix={arXiv},
      primaryClass={eess.AS},
      url={https://arxiv.org/abs/2407.11510}, 
}

Citation

If you use this evaluation set generator in your research, please cite:

@inproceedings{heo2026icassp,
    title={Mitigating False Alarms in Open-Set Speaker Identification with a Decoupled Framework},
    author={Heo, Hee-Soo and Lee, Minjae and Kwon, Youngki and Kim, Han-Gyu and Lee, Bong-Jin},
    booktitle={Proc. ICASSP},
    year={2026}
}
