AJILE Movement Prediction: Multimodal Deep Learning for Natural Human Neural Recordings and Video

Developing useful interfaces between brains and machines is a grand challenge of neuroengineering. An effective interface must not only interpret neural signals but also predict a person's intention to perform an action in the near future; prediction becomes even more challenging outside well-controlled laboratory experiments. This paper describes our approach to detecting and predicting natural human arm movements, a key challenge in brain-computer interfacing that has never before been attempted. We introduce the novel Annotated Joints in Long-term ECoG (AJILE) dataset; AJILE includes automatically annotated poses of 7 upper-body joints for four human subjects over 670 total hours (more than 72 million frames), along with the corresponding simultaneously acquired intracranial neural recordings. The size and scope of AJILE greatly exceed those of all previous datasets combining movements and electrocorticography (ECoG), making it possible to take a deep learning approach to movement prediction. We propose a multimodal model that combines deep convolutional neural networks (CNNs) with long short-term memory (LSTM) blocks, leveraging both ECoG and video modalities. We demonstrate that our models are able to detect movements and predict future movements up to 800 msec before movement initiation. Further, our multimodal movement prediction models exhibit resilience to simulated ablation of input neural signals. We believe a multimodal approach to natural neural decoding that takes context into account is critical for advancing bioelectronic technologies and human neuroscience.
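The abstract does not spell out the architecture, so the following is only a minimal, hypothetical PyTorch sketch of a multimodal CNN + LSTM movement classifier of the kind described: a 1-D CNN extracts features from a windowed ECoG segment, a small per-frame CNN followed by an LSTM summarizes the accompanying video frames, and the fused features are classified as movement versus rest. The class name, layer sizes, and input shapes are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn

class MultimodalMovementPredictor(nn.Module):
    """Hypothetical sketch of a CNN + LSTM fusion model for movement detection."""

    def __init__(self, n_ecog_channels=64, hidden=128):
        super().__init__()
        # 1-D CNN over the multichannel ECoG time series (per-window features)
        self.ecog_cnn = nn.Sequential(
            nn.Conv1d(n_ecog_channels, 32, kernel_size=7, stride=2), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        # Small 2-D CNN applied independently to each video frame
        self.frame_cnn = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        # LSTM over the per-frame features to capture temporal context
        self.video_lstm = nn.LSTM(input_size=32, hidden_size=hidden, batch_first=True)
        # Fuse both modalities and classify movement vs. rest
        self.classifier = nn.Linear(64 + hidden, 2)

    def forward(self, ecog, video):
        # ecog: (batch, channels, time); video: (batch, frames, 3, H, W)
        ecog_feat = self.ecog_cnn(ecog).squeeze(-1)          # (batch, 64)
        b, s = video.shape[:2]
        frame_feat = self.frame_cnn(video.flatten(0, 1))     # (batch*frames, 32, 1, 1)
        frame_feat = frame_feat.view(b, s, 32)               # (batch, frames, 32)
        _, (h, _) = self.video_lstm(frame_feat)              # h: (1, batch, hidden)
        fused = torch.cat([ecog_feat, h[-1]], dim=1)         # (batch, 64 + hidden)
        return self.classifier(fused)                        # (batch, 2) logits

In such a setup, the same network can be trained on windows ending at different offsets before movement onset (e.g., up to 800 msec earlier) to turn the detection task into a prediction task, and either input stream can be zeroed out to simulate ablation of a modality.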
