Cluster tendency assessment in neuronal spike data

Sorting spikes from extracellular recordings into clusters associated with distinct single units (putative neurons) is a fundamental step in analyzing neuronal populations. Spike sorting is intrinsically unsupervised, as the number of neurons is not known a priori. It therefore requires one of two approaches: specification of a fixed value c for the number of clusters to seek, or generation of candidate partitions for several possible values of c, followed by selection of the best candidate based on post-clustering validation criteria. In this paper, we investigate the first approach and evaluate the utility of several methods for providing lower-dimensional visualizations of the cluster structure, as well as their effect on subsequent spike clustering. We also introduce a visualization technique called improved visual assessment of cluster tendency (iVAT) to estimate possible cluster structures in data without the need for dimensionality reduction. Experiments are conducted on two datasets with ground truth labels. In data with a relatively small number of clusters, iVAT is beneficial in estimating the number of clusters to inform the initialization of clustering algorithms. With larger numbers of clusters, iVAT gives a useful estimate of the coarse cluster structure but sometimes fails to indicate the presumptive number of clusters. We show that noise associated with recording extracellular neuronal potentials can disrupt computational clustering schemes, highlighting the benefit of probabilistic clustering models. Our results show that t-Distributed Stochastic Neighbor Embedding (t-SNE) provides representations of the data that yield more accurate visualization of potential cluster structure to inform the clustering stage. Moreover, the clusters obtained using t-SNE features were more reliable than the clusters obtained using the other methods, which indicates that t-SNE can potentially be used both for visualization and for extracting features to be used by any clustering algorithm.
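
To make the iVAT idea concrete, the following is a minimal sketch of how a reordered dissimilarity image can be computed, assuming the commonly described formulation: a Prim-like VAT reordering of the dissimilarity matrix, followed by a recursive transform that replaces each entry with a min-max path distance. The function names `vat_order` and `ivat` and the toy data are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def vat_order(D):
    """Return a VAT-style ordering of objects from a square dissimilarity matrix D."""
    n = D.shape[0]
    # Start from one endpoint of the largest dissimilarity.
    i, _ = np.unravel_index(np.argmax(D), D.shape)
    order = [int(i)]
    remaining = np.ones(n, dtype=bool)
    remaining[i] = False
    # dist_to_tree[q]: smallest dissimilarity from q to any object already ordered.
    dist_to_tree = D[i].astype(float)
    for _ in range(n - 1):
        # Prim-like step: append the unordered object closest to the ordered set.
        j = int(np.argmin(np.where(remaining, dist_to_tree, np.inf)))
        order.append(j)
        remaining[j] = False
        dist_to_tree = np.minimum(dist_to_tree, D[j])
    return np.array(order)

def ivat(D):
    """Return (iVAT-transformed matrix, ordering) for dissimilarity matrix D."""
    order = vat_order(D)
    Dv = D[np.ix_(order, order)]          # VAT-reordered dissimilarities
    n = Dv.shape[0]
    Di = np.zeros((n, n), dtype=float)    # min-max path distances
    for r in range(1, n):
        # Nearest earlier object in the ordering.
        j = int(np.argmin(Dv[r, :r]))
        Di[r, j] = Dv[r, j]
        # All other earlier objects are reached through j.
        c = np.delete(np.arange(r), j)
        Di[r, c] = np.maximum(Dv[r, j], Di[j, c])
        Di[:r, r] = Di[r, :r]             # keep the matrix symmetric
    return Di, order

# Toy example: two well-separated groups, listed in interleaved order.
X = np.array([[0, 0], [10, 10], [0.1, 0], [10.1, 10], [0, 0.1], [10, 10.1]], dtype=float)
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
Di, order = ivat(D)
print(order)
print(np.round(Di, 2))
```

Displayed as a grayscale image, the transformed matrix would show dark blocks along the diagonal for groups of mutually close objects; counting those blocks is how the number of clusters is estimated visually. In this toy case the two groups end up contiguous in the ordering, and within-group entries are much smaller than between-group entries.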