Data selection in EEG signals classification

Alcoholism can be detected by analyzing electroencephalogram (EEG) signals. However, analyzing multi-channel EEG signals is challenging and often requires complicated calculations and long execution times. This paper proposes three data selection methods for extracting representative data from the EEG signals of alcoholics: principal component analysis based on graph entropy (PCA-GE), channel selection based on the graph entropy (GE) difference, and channel selection based on mathematical combinations. For comparison, the data selected by each method are classified separately by three classifiers: the J48 decision tree, the K-nearest neighbor and the Kstar. The experimental results show that the proposed methods select data without compromising the accuracy of discriminating between the EEG signals of alcoholics and non-alcoholics. Among them, the PCA-GE method uses only 29.69% of the whole data and 29.5% of the computation time, yet achieves a classification accuracy of 94.5%. The channel selection method based on the GE difference achieves a classification accuracy of 91.67% while also using only 29.69% of the original data. Using as little data as possible without sacrificing the final classification accuracy is useful when designing online EEG analysis and classification applications.
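The idea of ranking channels by their GE difference can be illustrated with a minimal sketch. The abstract does not say how the graph is built or how the entropy is defined, so the sketch assumes a horizontal visibility graph and the Shannon entropy of its degree distribution. The function names and the "keep the k largest differences" rule are hypothetical, chosen only for illustration, and are not necessarily the authors' procedure.

```python
import math
from collections import Counter

def hvg_degrees(series):
    """Node degrees of the horizontal visibility graph of a 1-D series (assumed construction)."""
    n = len(series)
    degrees = [0] * n
    for i in range(n - 1):
        highest_between = float("-inf")
        for j in range(i + 1, n):
            # i and j are linked when every sample strictly between them is lower than both
            if highest_between < min(series[i], series[j]):
                degrees[i] += 1
                degrees[j] += 1
            highest_between = max(highest_between, series[j])
            if highest_between >= series[i]:
                break  # nothing beyond j can be visible from i
    return degrees

def graph_entropy(series):
    """Shannon entropy (in bits) of the degree distribution of the series' graph."""
    degrees = hvg_degrees(series)
    counts = Counter(degrees)
    total = len(degrees)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def select_channels_by_ge_difference(ge_group_a, ge_group_b, k):
    """Indices of the k channels whose per-channel GE differs most between two groups."""
    diffs = [abs(a - b) for a, b in zip(ge_group_a, ge_group_b)]
    ranked = sorted(range(len(diffs)), key=lambda ch: diffs[ch], reverse=True)
    return ranked[:k]
```

In use, `graph_entropy` would be applied to each channel of each recording, the values averaged per group, and the two per-channel lists passed to `select_channels_by_ge_difference`; only the selected channels would then be handed to a classifier.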
