Classification of Electroencephalograph Data: A Hubness-aware Approach

Classification of electroencephalograph (EEG) data is the common denominator in various recognition tasks related to EEG signals. Automated recognition systems are especially useful when continuous, long-term EEG is recorded and the resulting data are too voluminous for human experts to analyze in depth. EEG-related recognition tasks may support medical diagnosis, and they are core components of EEG-controlled devices such as web browsers or spelling devices for paralyzed patients. State-of-the-art solutions are based on machine learning. In this paper, we show that EEG datasets contain hubs, i.e., signals that appear as nearest neighbors of surprisingly many signals. This paper is the first to document this observation for EEG datasets. Next, we argue that the presence of hubs has to be taken into account when classifying EEG signals; we therefore adapt hubness-aware classifiers to EEG data. Finally, we present the results of our empirical study on a large, publicly available collection of EEG signals and show that hubness-aware classifiers outperform the state-of-the-art time-series classifier.
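To make the notion of a hub concrete, the following minimal sketch counts how often each instance appears among the k nearest neighbors of the other instances (its k-occurrence). It is an illustration only, not the method of the paper: the function name `k_occurrence`, the use of squared Euclidean distance, and the synthetic Gaussian data are all assumptions made here, and a time-series distance would normally take the place of the Euclidean one for real EEG signals.

```python
import numpy as np


def k_occurrence(X, k):
    """Count, for each row of X, how many other rows have it among
    their k nearest neighbors (illustrative helper, assumed name)."""
    X = np.asarray(X, dtype=float)
    # Pairwise squared Euclidean distances (an assumption of this sketch).
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    # An instance is never counted as its own neighbor.
    np.fill_diagonal(d2, np.inf)
    # Indices of the k nearest neighbors of every instance.
    nn = np.argsort(d2, axis=1)[:, :k]
    # How often each index occurs across all neighbor lists.
    return np.bincount(nn.ravel(), minlength=X.shape[0])


# Synthetic stand-in data: 200 instances with 50 features each.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))
counts = k_occurrence(X, k=5)

# The counts always sum to n * k, so their mean is k; instances whose
# count is far above k are the hubs.
print("mean k-occurrence:", counts.mean())
print("largest k-occurrence:", counts.max())
```

Because the mean k-occurrence is fixed at k, inspecting the largest counts, or the skewness of their distribution, gives a quick indication of whether a dataset contains hubs.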
