DLVS: Time Series Architecture for Image-Based Visual Servoing

A novel deep learning-based visual servoing architecture, “DLVS”, is proposed for controlling an unmanned aerial vehicle (UAV) capable of quasi-stationary flight. A camera mounted under the vehicle tracks a target consisting of a finite set of stationary points lying in a plane. Current end-to-end servoing approaches based on deep learning and reinforcement learning (RL) train convolutional neural networks on color images with known camera poses to learn visual features of the environment that are suitable for servoing tasks. This restricts the network to the environments in which the dataset was collected. Moreover, such networks cannot be deployed on the low-power computers onboard the UAV. The proposed solution employs a time series architecture that learns temporal patterns from sequential values and outputs control cues to the flight controller. The low computational complexity and flexibility of the DLVS architecture enable real-time onboard tracking for virtually any target. The algorithm was validated in real-life environments and outperformed the current state of the art in both time efficiency and accuracy.
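The abstract does not specify what the "sequential values" are, so the following is only a minimal sketch under stated assumptions: it supposes that each frame contributes the pixel errors between the observed and desired positions of the target points, and that a fixed-length window of such frames is what a time series model would consume. The class name `FeatureWindow`, its parameters, and the error representation are hypothetical and are not taken from the DLVS paper.

```python
from collections import deque
from typing import Deque, List, Optional, Sequence, Tuple

Point = Tuple[float, float]  # (u, v) pixel coordinates of one target point


class FeatureWindow:
    """Hypothetical helper: fixed-length buffer of per-frame feature errors."""

    def __init__(self, desired: Sequence[Point], length: int) -> None:
        if length < 1:
            raise ValueError("window length must be at least 1")
        self.desired = list(desired)
        # deque with maxlen silently drops the oldest frame when full
        self.frames: Deque[List[float]] = deque(maxlen=length)
        self.length = length

    def push(self, observed: Sequence[Point]) -> Optional[List[List[float]]]:
        """Add one frame; return the full window once enough frames exist."""
        if len(observed) != len(self.desired):
            raise ValueError("observed and desired point counts differ")
        error: List[float] = []
        for (u, v), (u_d, v_d) in zip(observed, self.desired):
            # Assumed input representation: signed pixel error per coordinate
            error.extend([u - u_d, v - v_d])
        self.frames.append(error)
        if len(self.frames) < self.length:
            return None  # not enough history yet for a sequence model
        return [list(frame) for frame in self.frames]


# Usage: two target points, window of two frames
window = FeatureWindow(desired=[(0.0, 0.0), (1.0, 1.0)], length=2)
first = window.push([(1.0, 0.0), (1.0, 2.0)])
second = window.push([(0.0, 0.0), (1.0, 1.0)])
```

Each returned window is a `length × 2N` list for `N` target points. Because only point coordinates are buffered rather than images, the per-step input stays small, which is consistent with the low computational complexity the abstract claims; the model that maps such a window to control cues is deliberately left out here.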
