Curation Tutorial

After spike sorting and computing quality metrics, you can use those metrics to curate the spike sorting output automatically.

Import the necessary modules and functions from spikeinterface:

import spikeinterface.core as si

Let’s generate a simulated dataset, and imagine that the ground-truth sorting is in fact the output of a sorter.

recording, sorting = si.generate_ground_truth_recording()
print(recording)
print(sorting)

GroundTruthRecording (InjectTemplatesRecording): 4 channels - 25.0kHz - 1 segments
                      250,000 samples - 10.00s - float32 dtype - 3.81 MiB
GroundTruthSorting (NumpySorting): 10 units - 1 segments - 25.0kHz

Create SortingAnalyzer

For this example, we first need a SortingAnalyzer with a few extensions computed:

analyzer = si.create_sorting_analyzer(sorting=sorting, recording=recording, format="memory")
analyzer.compute(["random_spikes", "waveforms", "templates", "noise_levels"])

analyzer.compute("principal_components", n_components=3, mode="by_channel_local")
print(analyzer)

estimate_sparsity (no parallelization): 100%|██████████| 10/10 [00:00<00:00, 315.16it/s]
compute_waveforms (no parallelization): 100%|██████████| 10/10 [00:00<00:00, 278.31it/s]
noise_level (no parallelization): 100%|██████████| 20/20 [00:00<00:00, 281.44it/s]
Fitting PCA: 100%|██████████| 10/10 [00:00<00:00, 139.83it/s]
Projecting waveforms: 100%|██████████| 10/10 [00:00<00:00, 1322.25it/s]
SortingAnalyzer: 4 channels - 10 units - 1 segments - memory - sparse - has recording
Loaded 5 extensions: random_spikes, waveforms, templates, noise_levels, principal_components

Then we compute some quality metrics:

metrics_ext = analyzer.compute("quality_metrics", metric_names=["snr", "isi_violation", "nearest_neighbor"])
metrics = metrics_ext.get_data()
print(metrics)

         snr  isi_violations_ratio  ...  nn_hit_rate  nn_miss_rate
0  23.191724                   0.0  ...     0.825000      0.015657
1  16.985655                   0.0  ...     0.891391      0.009559
2  22.953610                   0.0  ...     0.884768      0.011501
3  14.814512                   0.0  ...     0.859756      0.015837
4   3.128944                   0.0  ...     0.787500      0.020558
5  61.535044                   0.0  ...     0.998630      0.001042
6   7.535709                   0.0  ...     0.827586      0.026292
7  31.274599                   0.0  ...     0.927160      0.007681
8  17.983749                   0.0  ...     0.858647      0.017391
9   4.399881                   0.0  ...     0.794483      0.022454

[10 rows x 5 columns]

We can now threshold each quality metric and select units based on some rules.

The easiest and most intuitive way is to build a boolean mask from the metrics dataframe, then use that mask to create the list of unit ids that we want to keep.

keep_mask = (metrics["snr"] > 7.5) & (metrics["isi_violations_ratio"] < 0.2) & (metrics["nn_hit_rate"] > 0.80)
print(keep_mask)

keep_unit_ids = keep_mask[keep_mask].index.tolist()
print(keep_unit_ids)

0     True
1     True
2     True
3     True
4    False
5     True
6     True
7     True
8     True
9    False
dtype: bool
['0', '1', '2', '3', '5', '6', '7', '8']
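
To see why units 4 and 9 are dropped, it can help to evaluate each rule separately before combining them. The following is a minimal sketch using pandas only, not part of the SpikeInterface API. The metric values are copied from the printed table above, the string unit ids "0" to "9" are an assumption inferred from the kept-unit list, and the names `rules`, `failed_rules` and `query_ids` are illustrative.

```python
import pandas as pd

# Metric values copied from the printed table above. The string unit ids
# "0".."9" are an assumption inferred from the kept-unit list.
metrics = pd.DataFrame(
    {
        "snr": [23.191724, 16.985655, 22.953610, 14.814512, 3.128944,
                61.535044, 7.535709, 31.274599, 17.983749, 4.399881],
        "isi_violations_ratio": [0.0] * 10,
        "nn_hit_rate": [0.825000, 0.891391, 0.884768, 0.859756, 0.787500,
                        0.998630, 0.827586, 0.927160, 0.858647, 0.794483],
    },
    index=[str(i) for i in range(10)],
)

# One boolean column per rule, so each criterion can be inspected on its own.
rules = pd.DataFrame(
    {
        "snr > 7.5": metrics["snr"] > 7.5,
        "isi_violations_ratio < 0.2": metrics["isi_violations_ratio"] < 0.2,
        "nn_hit_rate > 0.80": metrics["nn_hit_rate"] > 0.80,
    }
)

# A unit is kept only if it passes every rule.
keep_mask = rules.all(axis=1)
keep_unit_ids = keep_mask[keep_mask].index.tolist()

# For each rejected unit, list the rules it failed.
failed_rules = {
    unit_id: [name for name, passed in rules.loc[unit_id].items() if not passed]
    for unit_id in keep_mask[~keep_mask].index
}

# The same selection expressed as a single query string.
query_ids = metrics.query(
    "snr > 7.5 and isi_violations_ratio < 0.2 and nn_hit_rate > 0.80"
).index.tolist()

print(keep_unit_ids)
print(failed_rules)
```

Keeping one column per rule makes it easy to tune a single threshold and see which units it affects, while `DataFrame.query` is a compact alternative once the rules are settled.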

And now let’s create a sorting that contains only curated units and save it.

curated_sorting = sorting.select_units(keep_unit_ids)
print(curated_sorting)


curated_sorting.save(folder="curated_sorting", overwrite=True)

GroundTruthSorting (UnitsSelectionSorting): 8 units - 1 segments - 25.0kHz

NumpyFolder (NumpyFolderSorting): 8 units - 1 segments - 25.0kHz

Unit IDs
    ['0' '1' '2' '3' '5' '6' '7' '8']
Annotations
    name : GroundTruthSorting
Properties
    gt_unit_locations
        [[15.597422 2.885378 31.72171]
         [-2.775942 22.569551 12.428095]
         [-8.279635 -9.791637 40.27413]
         [21.658812 -5.0703335 24.142244]
         [17.32016 19.11672 7.7861094]
         [16.671326 -3.6040514 43.83887]
         [-5.62869 0.75520146 12.220262]
         [26.915827 13.710055 47.351532]]

We can also save the analyzer with only these units:

clean_analyzer = analyzer.select_units(unit_ids=keep_unit_ids, format="zarr", folder="clean_analyzer")

print(clean_analyzer)

SortingAnalyzer: 4 channels - 8 units - 1 segments - zarr - sparse - has recording
Loaded 6 extensions: random_spikes, waveforms, templates, noise_levels, principal_components, quality_metrics
