Description
Modern spike sorting pipelines commonly output a quality assessment for each unit. Kilosort uses categorical labels ("good", "mua", "noise"), as documented in the phy manual clustering guide, while other pipelines, such as the IBL spike sorting pipeline, use continuous quality scores derived from metrics such as contamination, drift, and the fraction of missed spikes. Currently, there is no standardized way to store this information in NWB, which leads to inconsistent practices across the community.
We (@alejoe91 and I) propose adding two optional reserved columns to the Units table: quality_label (text, for categorical labels) and quality_score (float32, for continuous metrics). This would provide a canonical, discoverable location for quality information that downstream tools can reliably query. The columns would be optional to maintain backward compatibility. Alternatively, we could establish these column names as a best practice without modifying the schema.
A related consideration is whether the Units table should contain pre-curation or post-curation results. Many users store all sorted units before manual curation. Having standardized quality columns would allow users to store all units while providing a clear mechanism for filtering to high-quality units, which we could then document as a best practice.
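As a rough illustration of the filtering step, the sketch below selects units from plain per-unit records (for example, rows of the Units table converted to dictionaries). The helper name `select_units`, the 0.5 threshold, and the rule that units with missing quality information are kept rather than dropped are all assumptions made for this example, not part of the proposal.

```python
def select_units(units, accepted_labels=("good",), min_score=None):
    """Return the unit records that pass the label and score filters.

    A unit is rejected only when it carries a value that fails a filter.
    Units without quality_label or quality_score are kept, so files that
    predate the columns are not silently emptied (an assumption of this sketch).
    """
    selected = []
    for unit in units:
        label = unit.get("quality_label")
        score = unit.get("quality_score")
        if label is not None and label not in accepted_labels:
            continue
        if min_score is not None and score is not None and score < min_score:
            continue
        selected.append(unit)
    return selected


# Made-up records: all sorted units are stored, before any curation.
all_units = [
    {"id": 0, "quality_label": "good", "quality_score": 0.95},
    {"id": 1, "quality_label": "mua", "quality_score": 0.40},
    {"id": 2, "quality_label": "noise", "quality_score": 0.05},
    {"id": 3, "quality_label": "good", "quality_score": 0.30},
]

# Label-only filter keeps units 0 and 3; adding a score threshold keeps unit 0.
by_label = select_units(all_units)
curated = select_units(all_units, accepted_labels=("good",), min_score=0.5)
```

Whether missing values should pass or fail such a filter is one of the things a best-practice document would need to settle.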
c.c. @alejoe91 @bendichter