Empirical Evaluation on the Impact of Class Overlap for EEG-Based Early Epileptic Seizure Detection

Electroencephalography (EEG) signals reflect the activity of the human brain and carry important physiological information. EEG is a complex signal that can be used for epileptic seizure detection and epilepsy diagnosis via machine learning. Because classification performance depends heavily on data quality, constructing a high-quality training dataset requires considerable effort, including raw signal preprocessing and data preprocessing for machine learning. Feature extraction has been widely used in EEG-based early epileptic seizure detection. Due to the complexity of data collection and labeling, some training instances are inevitably mislabeled, so some similar instances carry different labels. This is called the class overlap issue; it blurs the class boundary learned by classification models and makes constructing a high-quality classification model more difficult. However, previous studies investigating the impact of class overlap on EEG data are quite limited. Our goal is to investigate the impact of class overlap on EEG-based early epileptic seizure detection. We propose a special neighborhood cleaning rule (SNCR) to alleviate the class overlap issue. We conduct large-scale experiments on two widely used EEG datasets and compare our proposed SNCR strategy with a state-of-the-art data cleaning strategy, i.e., the improved $k$-means clustering cleaning approach (IKMCCA). The experimental results show that the classification model achieves significantly better performance in terms of AUC, recall, and F1 when using our proposed SNCR strategy. Therefore, for EEG-based early epileptic seizure detection, we recommend that researchers apply the SNCR strategy during data preprocessing to mitigate the class overlap issue in future related studies.
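To make the notion of class overlap concrete, the following is a minimal sketch of a generic neighborhood-based cleaning step: an instance is flagged when most of its nearest neighbors in feature space carry a different label. This is not the SNCR procedure itself, whose details are not given in this abstract; the function names `flag_overlapping` and `clean`, the Euclidean distance, and the choice `k=3` are all illustrative assumptions.

```python
from collections import Counter
from math import dist


def flag_overlapping(features, labels, k=3):
    """Return indices of instances whose k nearest neighbors mostly
    carry a different label (a generic rule, not the paper's SNCR)."""
    flagged = []
    for i, (x, y) in enumerate(zip(features, labels)):
        # Euclidean distance from instance i to every other instance.
        others = [(dist(x, features[j]), j)
                  for j in range(len(features)) if j != i]
        nearest = sorted(others)[:k]
        # Majority vote among the k nearest neighbors; an odd k avoids
        # ties when there are only two classes.
        votes = Counter(labels[j] for _, j in nearest)
        majority_label, _ = votes.most_common(1)[0]
        if majority_label != y:
            flagged.append(i)
    return flagged


def clean(features, labels, k=3):
    """Drop the flagged instances and return the remaining data."""
    drop = set(flag_overlapping(features, labels, k))
    keep = [i for i in range(len(features)) if i not in drop]
    return [features[i] for i in keep], [labels[i] for i in keep]


if __name__ == "__main__":
    # Hypothetical 2-D feature vectors: two well-separated clusters plus
    # one instance labeled 1 that sits inside the class-0 cluster.
    X = [(0, 0), (0, 1), (1, 0), (1, 1),
         (10, 10), (10, 11), (11, 10), (11, 11),
         (0.5, 0.5)]
    y = [0, 0, 0, 0, 1, 1, 1, 1, 1]
    print(flag_overlapping(X, y))
```

In this toy setting only the last instance is flagged, since all of its three nearest neighbors belong to the other class. Whether flagged instances should be removed or relabeled, and from which class, is a design decision this sketch leaves open.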
