EgoSampling: Wide View Hyperlapse From Egocentric Videos

The possibility of sharing one's point of view makes wearable cameras compelling. The videos they produce, however, are often long, boring, and coupled with extreme shake, as the camera is worn on a moving person. Fast-forwarding (i.e., frame sampling) is a natural choice for quick video browsing, but it accentuates the shake caused by natural head motion in an egocentric video, making the fast-forwarded video useless. We propose EgoSampling, an adaptive frame-sampling method that produces stable, fast-forwarded hyperlapse videos. Adaptive frame sampling is formulated as an energy minimization problem whose optimal solution can be found in polynomial time. We further turn camera shake from a drawback into a feature, enabling an increase in the field of view of the output video; this is obtained by mosaicing each output frame from several input frames. The proposed technique also enables the generation of a single hyperlapse video from multiple egocentric videos, allowing even faster video consumption.
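The abstract states that adaptive frame sampling is posed as an energy minimization solvable in polynomial time. One common way to realize such a formulation (a sketch, not the paper's exact construction) is to build a graph whose nodes are input frames, connect each frame to the frames it may jump to, assign each edge a cost measuring how much the jump deviates from the desired speed-up and smooth motion, and find a minimum-cost path by dynamic programming. The `edge_cost` callable below is a placeholder for whatever stability/speed energy the application defines; the quadratic cost in the usage example is purely illustrative.

```python
def hyperlapse_path(n_frames, min_skip, max_skip, edge_cost):
    """Select a minimum-energy frame path for fast-forwarding.

    Nodes are frame indices 0..n_frames-1; an edge i -> j exists when
    min_skip <= j - i <= max_skip, with cost edge_cost(i, j). Because
    all edges point forward in time, the graph is a DAG and a single
    forward dynamic-programming sweep finds the optimal path in
    polynomial time.
    """
    INF = float("inf")
    best = [INF] * n_frames   # best[i]: cheapest cost of a path ending at frame i
    pred = [-1] * n_frames    # predecessor of frame i on that cheapest path
    best[0] = 0.0
    for i in range(n_frames):
        if best[i] == INF:
            continue  # frame i is unreachable under the skip constraints
        for j in range(i + min_skip, min(i + max_skip, n_frames - 1) + 1):
            c = best[i] + edge_cost(i, j)
            if c < best[j]:
                best[j] = c
                pred[j] = i
    # Backtrack from the farthest reachable frame.
    end = max(range(n_frames), key=lambda k: (best[k] < INF, k))
    path = [end]
    while pred[path[-1]] != -1:
        path.append(pred[path[-1]])
    return path[::-1]


# Toy energy: penalize deviation from a target speed-up of 5x.
# A real system would combine terms such as frame-to-frame alignment
# error or deviation of the camera's forward direction.
path = hyperlapse_path(21, min_skip=3, max_skip=8,
                       edge_cost=lambda i, j: (j - i - 5) ** 2)
```

With this toy cost the optimal path steps uniformly by 5 frames (0, 5, 10, 15, 20), since any other skip incurs a positive penalty.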
