Autonomous UAV Cinematography

This tutorial examines the emerging field of autonomous UAV cinematography for non-experts, and also presents the required underlying technologies and their connections with other UAV application domains. Current industry practices are formalized in a UAV shot-type taxonomy composed of framing shot types, single-UAV camera motion types, and multiple-UAV camera motion types. Visually pleasing combinations of framing shot types and camera motion types are identified. The camera motion types are modeled geometrically and graded into distinct classes of energy consumption and of technological complexity required for autonomous capture. Two strategies, focal length compensation and multidrone compensation, are prescribed for partially overcoming several issues that arise in UAV coverage of live outdoor events, deemed the most complex UAV cinematography scenario. Finally, the shot types compatible with each compensation strategy are explicitly identified. Overall, the tutorial familiarizes readers from different backgrounds with the topic in a structured manner and lays the groundwork for future advances.
