TEMGNet: Deep Transformer-based Decoding of Upperlimb sEMG for Hand Gestures Recognition

There has been a recent surge of interest in Machine Learning (ML), particularly Deep Neural Network (DNN)-based models, for decoding muscle activities from surface Electromyography (sEMG) signals for myoelectric control of neurorobotic systems. DNN-based models, however, require large training sets and typically have high structural complexity, i.e., they depend on a large number of trainable parameters. To address these issues, we developed a framework based on the Transformer architecture for processing sEMG signals. We propose a novel Vision Transformer (ViT)-based neural network architecture (referred to as TEMGNet) to classify and recognize upper-limb hand gestures from sEMG for myocontrol of prostheses. The proposed TEMGNet architecture is trained with a small dataset without the need for pre-training or fine-tuning. To evaluate its efficacy, following the recent literature, the second subset (exercise B) of the NinaPro DB2 dataset was utilized, where the proposed TEMGNet framework achieved recognition accuracies of 82.93% and 82.05% for window sizes of 300 ms and 200 ms, respectively, outperforming its state-of-the-art counterparts. Moreover, the proposed TEMGNet framework is superior in terms of structural capacity while having seven times fewer trainable parameters. These characteristics and the high performance make DNN-based models promising approaches for myoelectric control of neurorobots.
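To make the window sizes above concrete, the following is a minimal sketch of how a multi-channel sEMG recording might be segmented into 200 ms or 300 ms windows and then cut into patch tokens of the kind a ViT-style model consumes. It is not the authors' pipeline: the 2 kHz sampling rate and 12 channels are assumptions about NinaPro DB2 that the abstract does not state, and the 10 ms step, the patch length of 40 samples, and the function names `sliding_windows` and `to_patch_tokens` are hypothetical choices made for illustration.

```python
import numpy as np

def sliding_windows(emg, fs_hz, window_ms, step_ms):
    """Split a (samples, channels) array into overlapping windows.

    Returns an array of shape (num_windows, window_samples, channels).
    """
    win = int(round(fs_hz * window_ms / 1000))
    step = int(round(fs_hz * step_ms / 1000))
    num = (emg.shape[0] - win) // step + 1
    if num <= 0:
        # Recording is shorter than one window.
        return np.empty((0, win, emg.shape[1]), dtype=emg.dtype)
    return np.stack([emg[i * step : i * step + win] for i in range(num)])

def to_patch_tokens(windows, patch_len):
    """Cut each window along time into non-overlapping patches.

    Each patch covers patch_len consecutive samples across all channels
    and is flattened into one token vector.
    Returns shape (num_windows, num_patches, patch_len * channels).
    """
    num, win, channels = windows.shape
    if win % patch_len != 0:
        raise ValueError("window length must be a multiple of patch_len")
    return windows.reshape(num, win // patch_len, patch_len * channels)

# Assumed recording parameters (not given in the abstract).
FS_HZ = 2000      # assumed sampling rate
CHANNELS = 12     # assumed electrode count
STEP_MS = 10      # hypothetical sliding step
PATCH_LEN = 40    # hypothetical patch length in samples

rng = np.random.default_rng(0)
emg = rng.standard_normal((2 * FS_HZ, CHANNELS))  # 2 s of synthetic data

win200 = sliding_windows(emg, FS_HZ, window_ms=200, step_ms=STEP_MS)
win300 = sliding_windows(emg, FS_HZ, window_ms=300, step_ms=STEP_MS)
tok200 = to_patch_tokens(win200, PATCH_LEN)
tok300 = to_patch_tokens(win300, PATCH_LEN)
print(win200.shape, tok200.shape)
print(win300.shape, tok300.shape)
```

Under these assumptions a 200 ms window holds 400 samples and a 300 ms window holds 600, so the longer window yields more tokens per example. In a ViT-style model, each token would then pass through a learned linear projection and receive a position embedding before entering the Transformer encoder.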
