Understanding real-world scenes for human-like machine perception

The rise of autonomous machines in our day-to-day lives has led to an increasing demand for machine perception of the real world that is robust, accurate and human-like. Research in visual scene understanding over the past two decades has focused on machine perception in controlled environments, such as indoor scenes containing static, rigid objects. There is a gap in the literature for machine perception in general complex scenes, for example outdoor scenes with multiple interacting people. The proposed research addresses the limitations of existing methods through an unsupervised framework that simultaneously models, semantically segments and estimates motion for general dynamic scenes captured as multi-view videos with a network of static or moving cameras. In this talk I will explain the proposed joint framework for understanding general dynamic scenes for machine perception; give a comprehensive performance evaluation against state-of-the-art techniques on challenging indoor and outdoor sequences; and demonstrate applications such as virtual, augmented and mixed reality (VR/AR/MR) and broadcast production (free-viewpoint video, FVV).
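
To make the shape of such a joint framework concrete, the Python sketch below shows one plausible per-frame loop: an initial multi-view reconstruction, alternating segmentation and geometry refinement so each task constrains the other, and dense scene-flow estimation between consecutive frames for temporal coherence. All function names and the stubbed steps are illustrative assumptions for exposition, not the talk's actual implementation.

```python
# Hypothetical sketch of a joint model / segment / motion pipeline.
# Every name and stub here is an illustrative assumption, not the
# authors' actual API or algorithm.

from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class FrameEstimate:
    geometry: Any = None   # e.g. per-view depth maps or a fused mesh
    labels: Any = None     # per-pixel semantic segmentation
    motion: Any = None     # dense scene flow linking frame t to t+1


def _initial_reconstruction(views: List[Any], poses: List[Any]) -> Any:
    """Stub: coarse multi-view stereo from the synchronized views."""
    return {"n_views": len(views)}


def _segment(views: List[Any], geometry: Any) -> Any:
    """Stub: unsupervised segmentation constrained by current geometry."""
    return {"labels": "per-pixel"}


def _refine_geometry(views: List[Any], labels: Any) -> Any:
    """Stub: geometry refinement constrained by current labels."""
    return {"refined": True}


def _scene_flow(prev: FrameEstimate, curr: FrameEstimate) -> Any:
    """Stub: dense 3D motion between consecutive frame estimates."""
    return {"flow": "t -> t+1"}


def understand_dynamic_scene(frames: List[List[Any]],
                             poses: List[List[Any]],
                             joint_iters: int = 3) -> List[FrameEstimate]:
    """Jointly model, segment and estimate motion over a multi-view sequence.

    frames: list over time of synchronized multi-view image sets.
    poses:  per-frame camera extrinsics (cameras may be static or moving).
    """
    estimates: List[FrameEstimate] = []
    prev: Optional[FrameEstimate] = None
    for t, views in enumerate(frames):
        est = FrameEstimate(geometry=_initial_reconstruction(views, poses[t]))
        # Alternate segmentation and reconstruction so each task
        # constrains the other -- the "joint" aspect of the framework.
        for _ in range(joint_iters):
            est.labels = _segment(views, est.geometry)
            est.geometry = _refine_geometry(views, est.labels)
        # Dense motion between consecutive frames enforces temporal coherence.
        if prev is not None:
            prev.motion = _scene_flow(prev, est)
        estimates.append(est)
        prev = est
    return estimates
```

The key design choice this sketch highlights is the alternation: segmentation and reconstruction are solved together rather than in a fixed sequence, which is what distinguishes a joint framework from a pipeline of independent stages.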
