Light-Weight Object Detection and Decision Making via Approximate Computing in Resource-Constrained Mobile Robots

Most current solutions for autonomous flight in indoor environments rely on purely geometric maps (e.g., point clouds). There has been, however, growing interest in supplementing such maps with semantic information (e.g., object detections) obtained from computer vision algorithms. Unfortunately, there is a mismatch between the relatively heavy computational requirements of these computer vision solutions and the limited computational capacity available on mobile autonomous platforms. In this paper, we propose to bridge this gap with a novel Markov Decision Process framework that adapts the parameters of the vision algorithms to the incoming video data rather than fixing them a priori. As a concrete example, we test our framework on an object detection and tracking task, using a combination of publicly available and novel datasets, and show significant energy savings without considerable loss in accuracy.
