Sleep-CMKD: Self-Attention CNN/Transformer Cross-Model Knowledge Distillation for Automatic Sleep Staging

In this paper, we propose a single-channel cross-model knowledge distillation (CMKD) method between convolutional neural network-based and transformer-based models for automatic sleep staging. In sleep staging, a few works have proposed distilling knowledge from an additional sleep dataset or from multi-channel polysomnograms, which requires manual scoring effort from human experts and limits home application scenarios. Cross-model knowledge distillation avoids this additional effort and these limitations by distilling inductive biases between models with different structures. Experiments on the Sleep-EDFX-78 dataset confirm that the proposed method improves the sleep stage classification accuracy of the transformer-based model by 1.7%.
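
As a rough illustration of the kind of objective that knowledge distillation typically uses, the sketch below combines a hard-label cross-entropy term with a temperature-softened KL term between teacher and student outputs. The abstract does not state the loss actually used, so this is the generic soft-label formulation only. The function names, the weighting `alpha`, the `temperature`, and the five-class logits are assumptions made for illustration. In a cross-model setting the teacher and student would be a CNN and a transformer (in either direction) operating on the same single-channel input.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      alpha=0.5, temperature=2.0):
    """Hypothetical helper: weighted sum of a hard-label and a soft-label term.

    alpha and temperature are illustrative defaults, not values from the paper.
    """
    # Hard-label term: cross-entropy of the student against the scored stage.
    p_student = softmax(student_logits)
    cross_entropy = -math.log(p_student[label])

    # Soft-label term: KL(teacher || student) on temperature-softened outputs.
    q_teacher = softmax(teacher_logits, temperature)
    q_student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s)
             for t, s in zip(q_teacher, q_student) if t > 0)

    # The T^2 factor keeps the soft-term gradient scale comparable across T.
    return (1 - alpha) * cross_entropy + alpha * temperature ** 2 * kl


# Example with five assumed sleep-stage classes (W, N1, N2, N3, REM).
student = [1.2, 0.3, 2.1, -0.5, 0.4]
teacher = [0.8, 0.1, 2.6, -0.2, 0.9]
loss = distillation_loss(student, teacher, label=2)
```

When the student's logits equal the teacher's, the KL term vanishes and only the hard-label term remains, so the soft term penalizes only disagreement between the two models.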
