The TUH EEG Corpus: A big data resource for automated EEG interpretation

The Neural Engineering Data Consortium (NEDC) is releasing its first major big data corpus: the Temple University Hospital EEG Corpus. This corpus consists of over 25,000 EEG studies and includes, for each study, a neurologist's interpretation of the test, a brief patient medical history, and demographic information about the patient such as gender and age. For the first time, there is a sufficient amount of data to support the application of state-of-the-art machine learning algorithms. In this paper, we present pilot results of experiments on the prediction of some basic attributes of an EEG from the raw EEG signal data using a 3,762-session subset of the corpus. Standard machine learning approaches are shown to be capable of predicting commonly occurring events from simple features with high accuracy on closed-loop testing, and can deliver error rates below 50% on a 6-way open-set classification problem. This performance is very promising, since commercial technology fails on this data.
