A review on cluster estimation methods and their application to neural spike data

The extracellular action potentials recorded on an electrode result from the collective, simultaneous electrophysiological activity of an unknown number of neurons. Identifying these action potentials and assigning them to the neurons that fired them, a task known as spike sorting, is an indispensable step in studying how an individual neuron or an ensemble of neurons functions and responds to particular stimuli. Within spike sorting, determining the number of clusters (neurons) is arguably the most challenging issue, owing to background noise and to the overlap and interactions among neurons in neighbouring regions. It is therefore not surprising that some researchers still rely on visual inspection by experts to estimate the number of clusters. Manual inspection, however, is not suitable for processing the vast, ever-growing amount of neural data. To address this need, in this paper thirty-three clustering validity indices are comprehensively reviewed and implemented to determine the number of clusters in neural datasets. To gauge the suitability of the indices for neural spike data and to inform the selection process, we calculated the indices by applying k-means clustering to twenty widely used synthetic neural datasets and one empirical dataset, and evaluated the performance of these indices against pre-existing ground truth labels. The results showed that the top five validity indices work consistently well across variations in noise level, for both the synthetic datasets and the real dataset. Using these top-performing indices provides strong support for determining the number of neural clusters, which is essential in the spike sorting process.
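
The general procedure described above (run k-means for several candidate cluster counts, score each partition with a validity index, and keep the best-scoring count) can be sketched as follows. This is a minimal illustration and not the paper's implementation: it uses only the Calinski-Harabasz index as a single example of an internal validity index, and the toy two-dimensional "spike features", the cluster centres, the candidate range, and all function names are assumptions made for the sketch.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def kmeans(points, k, rng, n_init=20, max_iter=100):
    """Lloyd's k-means; returns the labels of the restart with the
    lowest within-cluster sum of squares."""
    best = None
    for _ in range(n_init):
        centroids = rng.sample(points, k)  # random data points as seeds
        labels = None
        for _ in range(max_iter):
            new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                          for p in points]
            if new_labels == labels:  # assignments stable: converged
                break
            labels = new_labels
            for j in range(k):
                members = [p for p, l in zip(points, labels) if l == j]
                if members:  # an empty cluster keeps its old centroid
                    centroids[j] = mean_point(members)
        wss = sum(sq_dist(p, centroids[l]) for p, l in zip(points, labels))
        if best is None or wss < best[0]:
            best = (wss, labels)
    return best[1]

def calinski_harabasz(points, labels, k):
    """CH(k) = [B / (k - 1)] / [W / (n - k)], where B and W are the
    between- and within-cluster sums of squares."""
    n = len(points)
    overall = mean_point(points)
    between = within = 0.0
    for j in range(k):
        members = [p for p, l in zip(points, labels) if l == j]
        if not members:
            continue
        c = mean_point(members)
        between += len(members) * sq_dist(c, overall)
        within += sum(sq_dist(p, c) for p in members)
    return (between / (k - 1)) / (within / (n - k))

def estimate_num_clusters(points, k_values, seed=0):
    """Score each candidate k and return the k with the highest index."""
    rng = random.Random(seed)
    scores = {}
    for k in k_values:
        labels = kmeans(points, k, rng)
        scores[k] = calinski_harabasz(points, labels, k)
    return max(scores, key=scores.get), scores

def make_toy_features(seed=1, per_cluster=50, sigma=0.8):
    """Hypothetical stand-in for spike features: three Gaussian blobs."""
    rng = random.Random(seed)
    centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]
    return [(rng.gauss(cx, sigma), rng.gauss(cy, sigma))
            for cx, cy in centers for _ in range(per_cluster)]

points = make_toy_features()
best_k, scores = estimate_num_clusters(points, range(2, 7))
print("estimated number of clusters:", best_k)
```

On real recordings the input would be features extracted from detected spike waveforms rather than synthetic blobs, and other validity indices can be substituted for `calinski_harabasz` as long as the selection rule (maximum or minimum) matches the index.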
