Scene Chronology

We present a new method for taking an urban scene reconstructed from a large Internet photo collection and reasoning about its change in appearance through time. Our method estimates when individual 3D points in the scene existed, then uses spatial and temporal affinity between points to segment the scene into spatio-temporally consistent clusters. The result of this segmentation is a set of spatio-temporal objects that often correspond to meaningful units, such as billboards, signs, street art, and other dynamic scene elements, along with estimates of when each existed. Our method is robust and scalable to scenes with hundreds of thousands of images and billions of noisy, individual point observations. We demonstrate our system on several large-scale scenes and show an application to timestamping photos. Our work can serve to chronicle a scene over time, documenting its history and discovering dynamic elements in a way that can be easily explored and visualized.
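The two stages named above (per-point existence estimates, then grouping by spatial and temporal affinity) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and not the method of the paper: the function names, the rule that a point's interval spans its earliest and latest positive observations, the overlap score, the thresholds `max_dist` and `min_overlap`, and the use of union-find over all point pairs are hypothetical choices. A real system at the scale described would need noise-robust estimation and a far more scalable grouping step.

```python
from itertools import combinations


def estimate_interval(observations):
    """Toy existence estimate (assumption): span of the positive observations.

    observations: list of (timestamp, seen) pairs for one 3D point.
    Returns (start, end), or None if the point was never seen.
    """
    seen_times = [t for t, seen in observations if seen]
    if not seen_times:
        return None
    return min(seen_times), max(seen_times)


def interval_overlap(a, b):
    """Overlap length divided by the combined span of two intervals."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    span = max(a[1], b[1]) - min(a[0], b[0])
    # Two identical zero-length intervals count as fully overlapping.
    return inter / span if span > 0 else 1.0


def cluster_points(positions, intervals, max_dist=1.0, min_overlap=0.5):
    """Group points that are close in space AND overlap in time.

    Uses union-find over all pairs (quadratic; fine only for a toy example).
    Returns one cluster label per point.
    """
    parent = list(range(len(positions)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(positions)), 2):
        dist = sum((p - q) ** 2 for p, q in zip(positions[i], positions[j])) ** 0.5
        if dist <= max_dist and interval_overlap(intervals[i], intervals[j]) >= min_overlap:
            parent[find(i)] = find(j)

    return [find(i) for i in range(len(positions))]


if __name__ == "__main__":
    # Hypothetical data: two nearby points from the same period, one nearby
    # point from a later period (e.g. a replaced billboard), one distant point.
    positions = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (0.4, 0.0, 0.0), (10.0, 0.0, 0.0)]
    intervals = [(2005, 2008), (2005, 2008), (2010, 2012), (2005, 2008)]
    print(cluster_points(positions, intervals))
```

In this toy setting, the third point sits next to the first two but gets its own cluster because its time interval does not overlap theirs, which is the intuition behind separating successive billboards or signs that occupy the same place.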
