Real-time prediction learning for the simultaneous actuation of multiple prosthetic joints

Integrating learned predictions into a prosthetic control system promises to enhance multi-joint prosthesis use by amputees. In this article, we present a preliminary study of different cases where it may be beneficial to use a set of temporally extended predictions, learned and maintained in real time, within an engineered or learned prosthesis controller. Our study demonstrates the first successful combination of actor-critic reinforcement learning with real-time prediction learning. We evaluate this new approach to control learning during the myoelectric operation of a robot limb. Our results suggest that the integration of real-time prediction and control learning may speed control policy acquisition, allow unsupervised adaptation in myoelectric controllers, and facilitate synergies in highly actuated limbs. These experiments also show that temporally extended prediction learning enables anticipatory actuation, opening the way for coordinated motion in assistive robotic devices. Our work therefore provides initial evidence that real-time prediction learning is a practical way to support intuitive joint control in increasingly complex prosthetic systems.
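As a rough illustration of what a temporally extended prediction learned in real time can look like, the sketch below uses linear TD(lambda) to estimate the discounted sum of a future signal, updating once per time step. It is not the authors' implementation: the class name `TDLambdaPredictor`, the helper `run_demo`, the feature vector, and all parameter values are assumptions chosen only for illustration.

```python
import numpy as np


class TDLambdaPredictor:
    """Hypothetical linear TD(lambda) learner for one temporally extended prediction."""

    def __init__(self, n_features, alpha=0.1, gamma=0.9, lam=0.9):
        self.w = np.zeros(n_features)  # prediction weights
        self.e = np.zeros(n_features)  # accumulating eligibility trace
        self.alpha = alpha             # step size (assumed value)
        self.gamma = gamma             # discount; sets the prediction's time scale
        self.lam = lam                 # trace decay rate (assumed value)

    def predict(self, x):
        """Current estimate of the discounted sum of the future signal."""
        return float(self.w @ x)

    def update(self, x, signal, x_next):
        """One incremental update from a single transition; returns the TD error."""
        delta = signal + self.gamma * self.predict(x_next) - self.predict(x)
        self.e = self.gamma * self.lam * self.e + x
        self.w += self.alpha * delta * self.e
        return delta


def run_demo(steps=2000):
    """Toy stream: a constant signal of 1.0 observed with a single bias feature."""
    learner = TDLambdaPredictor(n_features=1)
    x = np.array([1.0])
    for _ in range(steps):
        learner.update(x, 1.0, x)
    return learner.predict(x)


if __name__ == "__main__":
    # For a constant signal of 1.0 the target is 1 / (1 - gamma) = 10.
    print(run_demo())
```

In a control setting, predictions of this kind could be supplied as additional inputs to an engineered or learned controller; how the article actually wires them in is not specified in the abstract.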
