Improved space breakdown method – A robust clustering technique for spike sorting

The Space Breakdown Method (SBM) is a clustering algorithm developed specifically for low-dimensional neuronal spike sorting. Cluster overlap and imbalance are common characteristics of neuronal data and pose difficulties for clustering methods. SBM identifies overlapping clusters by first locating cluster centres and then expanding them. To do so, it divides the range of values of each feature into chunks of equal size, counts the number of points in each chunk, and uses these counts to find the cluster centres and expand them. SBM has been shown to be competitive with other well-known clustering algorithms, especially in the two-dimensional case, but it is too computationally expensive for high-dimensional data. Here, we present two main improvements to the original algorithm that increase its ability to deal with high-dimensional data while preserving its performance: the initial array structure is replaced with a graph structure, and the number of partitions is made feature-dependent. We call this improved version the Improved Space Breakdown Method (ISBM). In addition, we propose a clustering validation metric that does not punish overclustering and thus yields more suitable evaluations of clustering for spike sorting. Because extracellular data recorded from the brain is unlabelled, we use simulated neural data, for which the ground truth is known, to evaluate performance more accurately. Evaluations conducted on synthetic data indicate that the proposed improvements reduce the space and time complexity of the original algorithm, while simultaneously leading to increased performance on neural data when compared with other state-of-the-art algorithms. Code is available at https://github.com/ArdeleanRichard/Space-Breakdown-Method.
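The chunking and counting step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `chunk_counts` and `candidate_centres`, the `min_count` threshold, and the rule that a centre is a chunk whose count is not exceeded by any adjacent chunk are assumptions made for illustration. Storing only occupied chunks in a dictionary is likewise a stand-in to show why a sparse structure helps in high dimensions; it is not the graph structure used by ISBM.

```python
from itertools import product

import numpy as np


def chunk_counts(points, n_chunks):
    """Count points per equal-size chunk; only occupied chunks are stored."""
    points = np.asarray(points, dtype=float)
    mins = points.min(axis=0)
    spans = points.max(axis=0) - mins
    spans[spans == 0] = 1.0  # avoid division by zero for constant features
    # Map every coordinate to a chunk index in [0, n_chunks - 1].
    idx = np.floor((points - mins) / spans * n_chunks).astype(int)
    idx = np.clip(idx, 0, n_chunks - 1)
    counts = {}
    for cell in map(tuple, idx.tolist()):
        counts[cell] = counts.get(cell, 0) + 1
    return counts


def candidate_centres(counts, min_count=1):
    """Return chunks with at least min_count points that no neighbour exceeds.

    The local-maximum rule is an assumed simplification of centre search.
    """
    centres = []
    for cell, count in counts.items():
        if count < min_count:
            continue
        is_peak = True
        for offset in product((-1, 0, 1), repeat=len(cell)):
            if not any(offset):
                continue  # skip the chunk itself
            neighbour = tuple(c + o for c, o in zip(cell, offset))
            if counts.get(neighbour, 0) > count:
                is_peak = False
                break
        if is_peak:
            centres.append(cell)
    return centres


if __name__ == "__main__":
    pts = [[0, 0], [0.1, 0.1], [0.05, 0.05], [10, 10], [9.9, 9.9], [5, 5]]
    counts = chunk_counts(pts, n_chunks=4)
    print(counts)
    print(sorted(candidate_centres(counts, min_count=2)))
```

Because empty chunks are never materialised, memory grows with the number of occupied chunks rather than with the number of chunks raised to the power of the dimensionality, which is the cost that makes a dense array impractical for high-dimensional data.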
