Querying Multiple Simultaneous Video Streams with 3D Interest Maps

With the proliferation of mobile devices equipped with cameras and video recording applications, it is now common to see multiple mobile cameras filming the same scene at an event from a diverse set of viewing angles. These recorded videos provide a rich set of data for someone to re-experience the event at a later time. Not all of the recorded videos, however, show a desirable view, and navigating through a large collection of videos to find one with a better viewing angle can be time-consuming. We propose a query-response interface in which users can intuitively switch to another video with an alternate, better view by selecting a 2D region within a video as a query. The system then responds with another video that has a better view of the selected region, chosen to maximize the viewpoint entropy. The key to our system is a lightweight 3D scene structure, termed a 3D interest map. A 3D interest map is a natural extension of saliency maps to 3D space, since most users film what they find interesting from their respective viewpoints. A user study with more than 35 users shows that our video query system achieves a suitable compromise between accuracy and run-time.
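
As a rough illustration of the selection criterion, the sketch below ranks candidate videos by viewpoint entropy in its usual formulation: the Shannon entropy of the relative projected areas of the scene elements visible from a view. How the projected areas are obtained from the 3D interest map is not described in this abstract, so the per-camera area lists, the camera names, and the functions `viewpoint_entropy` and `best_view` are all hypothetical.

```python
import math


def viewpoint_entropy(projected_areas):
    """Shannon entropy (in bits) of the relative projected areas seen from one view.

    projected_areas: non-negative areas, one per scene element of the
    queried region, as projected into this view (assumed to be given).
    """
    total = sum(projected_areas)
    if total <= 0:
        return 0.0  # nothing of the queried region is visible
    entropy = 0.0
    for area in projected_areas:
        if area > 0:  # elements with zero area contribute nothing
            p = area / total
            entropy -= p * math.log2(p)
    return entropy


def best_view(candidates):
    """Return the key of the candidate view with the highest entropy."""
    return max(candidates, key=lambda cam: viewpoint_entropy(candidates[cam]))


# Hypothetical projected areas (in pixels) of three elements of the
# selected region, as seen from three candidate cameras.
candidates = {
    "cam_a": [90.0, 5.0, 5.0],    # one element dominates the view
    "cam_b": [40.0, 30.0, 30.0],  # elements are shown more evenly
    "cam_c": [100.0, 0.0, 0.0],   # only one element is visible
}

chosen = best_view(candidates)
```

Under these assumed inputs, the view that shows the elements of the selected region most evenly gets the highest score, which is the intuition behind preferring it as the response to the query.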
