FlyCap: Markerless Motion Capture Using Multiple Autonomous Flying Cameras

Aiming at automatic, convenient and non-intrusive motion capture, this paper presents a new-generation markerless motion capture technique, the FlyCap system, which captures the surface motion of moving characters using multiple autonomous flying cameras (autonomous unmanned aerial vehicles (UAVs), each integrated with an RGBD video camera). During data capture, three cooperative flying cameras automatically track and follow a moving target who performs large-scale motions in a wide space. We propose a novel non-rigid surface registration method that tracks and fuses the depth data from the three flying cameras to recover the surface motion of the moving target while simultaneously calculating the pose of each flying camera. We leverage the visual-odometry information provided by the UAV platform and formulate the surface tracking problem as a non-linear objective function that can be linearized and efficiently minimized with a Gauss-Newton method. Quantitative and qualitative experimental results demonstrate plausible surface and motion reconstruction.
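
To make the optimization step concrete, the following is a minimal sketch of a Gauss-Newton iteration on a toy problem: aligning two 2D point sets with a rigid transform (one rotation angle plus a translation). It is not the FlyCap energy, which covers non-rigid surface deformation and camera poses; the functions `residuals`, `jacobian` and `gauss_newton`, the parameterization and the sample points are all illustrative assumptions. The sketch only shows the pattern the abstract names: linearize the non-linear residual around the current estimate, solve the resulting normal equations, update, and repeat.

```python
import numpy as np


def residuals(params, src, dst):
    """Stacked residuals R(theta) @ p_i + t - q_i for all point pairs (toy model)."""
    theta, tx, ty = params
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    return (src @ R.T + np.array([tx, ty]) - dst).ravel()


def jacobian(params, src):
    """Analytic Jacobian of the residuals with respect to (theta, tx, ty)."""
    theta = params[0]
    c, s = np.cos(theta), np.sin(theta)
    dR = np.array([[-s, -c], [c, -s]])  # derivative of R with respect to theta
    J = np.zeros((2 * src.shape[0], 3))
    J[:, 0] = (src @ dR.T).ravel()  # rotation column
    J[0::2, 1] = 1.0                # x residuals depend on tx
    J[1::2, 2] = 1.0                # y residuals depend on ty
    return J


def gauss_newton(src, dst, params0, iters=20, tol=1e-10):
    """Minimize the sum of squared residuals by repeated linearization."""
    params = np.asarray(params0, dtype=float)
    for _ in range(iters):
        r = residuals(params, src, dst)
        J = jacobian(params, src)
        # Normal equations of the linearized problem: (J^T J) step = -J^T r
        step = np.linalg.solve(J.T @ J, -J.T @ r)
        params = params + step
        if np.linalg.norm(step) < tol:
            break
    return params


# Hypothetical data: transform a few points by a known motion, then recover it.
src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.5]])
true_params = np.array([0.3, 1.0, -2.0])
dst = residuals(true_params, src, np.zeros_like(src)).reshape(-1, 2)
estimate = gauss_newton(src, dst, params0=[0.0, 0.0, 0.0])
```

In a real registration pipeline the unknowns would number in the thousands and the normal equations would be sparse, so a dense `np.linalg.solve` as used here would likely be replaced by a sparse or iterative solver.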
