Towards a Robust Aerial Cinematography Platform: Localizing and Tracking Moving Targets in Unstructured Environments

The use of drones for aerial cinematography has revolutionized several applications and industries that require live and dynamic camera viewpoints, such as entertainment, sports, and security. However, safely controlling a drone while filming a moving target usually requires multiple expert human operators; hence the need for an autonomous cinematographer. Current approaches have severe real-life limitations, such as requiring fully scripted scenes, high-precision motion-capture systems or GPS tags to localize targets, and prior maps of the environment to avoid obstacles and plan for occlusion. In this work, we overcome such limitations and propose a complete system for aerial cinematography that combines: (1) a vision-based algorithm for target localization; (2) a real-time incremental 3D signed-distance map algorithm for occlusion and safety computation; and (3) a real-time camera motion planner that optimizes for smoothness, collision and occlusion avoidance, and artistic guidelines. We evaluate robustness and real-time performance in a series of field experiments and simulations by tracking dynamic targets moving through unknown, unstructured environments. Finally, we verify that, despite removing previous limitations, our system achieves state-of-the-art performance.
