Experimental Learning of a Lift-Maximizing Central Pattern Generator for a Flapping Robotic Wing

In this work, we apply a policy gradient algorithm to a real-time robotic learning problem in which the goal is to maximize the average lift generated by a dynamically scaled robotic wing at a constant Reynolds number (Re). Compared with our previous work, this work offers two advances. First, a central pattern generator (CPG) model was used as the motion controller; it generated rhythmic wing motion patterns and transitioned smoothly between them while the CPG was being updated by the policy gradient, thereby accelerating sample generation and reducing the total learning time. Second, the kinematics included three degrees of freedom (stroke, deviation, and pitching) and were free of any half-stroke symmetry constraint; together, these yielded a larger kinematic space, which was then explored by the policy gradient to maximize lift generation. The learned wing kinematics used the full range of stroke and deviation, implying that wing trajectories with larger disk areas and lower frequencies were preferred for high lift generation at constant Re. Furthermore, the wing pitching amplitude converged to values between $45^{\circ}$ and $49^{\circ}$ regardless of the values of the other parameters. Notably, the learning agent found two locally optimal wing motion patterns, which had distinct wing trajectory shapes but generated similar cycle-averaged lift.
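
The link between a larger stroke range and a lower flapping frequency at constant Re can be sketched as follows. This is an illustration only. It assumes the common hovering-flight definition of Re based on the mean wing-tip speed, and the definition used in the experiments may differ. The symbols $\Phi$ (peak-to-peak stroke amplitude), $f$ (flapping frequency), $R$ (wing length), $\bar{c}$ (mean chord), and $\nu$ (kinematic viscosity) are introduced here for the sketch and do not appear in the abstract.

$$\mathrm{Re} = \frac{\bar{U}_{\mathrm{tip}}\,\bar{c}}{\nu}, \qquad \bar{U}_{\mathrm{tip}} \approx 2\,\Phi f R \quad\Longrightarrow\quad f \approx \frac{\mathrm{Re}\,\nu}{2\,\Phi R\,\bar{c}}$$

Under this assumption, for a fixed wing and fluid, holding Re constant fixes the product $\Phi f$, so a trajectory that uses a larger stroke amplitude must flap at a proportionally lower frequency.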
