Computer Vision – ECCV 2014

We introduce an approach for analyzing annotated maps of a site, together with Internet photos, to reconstruct large indoor spaces of famous tourist sites. While current 3D reconstruction algorithms often produce a set of disconnected components (3D pieces) for indoor scenes due to scene coverage or matching failures, we make use of a provided map to lay out the 3D pieces in a global coordinate system. Our approach leverages position, orientation, and shape cues extracted from the map and 3D pieces and optimizes a global objective to recover the global layout of the pieces. We introduce a novel crowd flow cue that measures how people move across the site to recover 3D geometry orientation. We show compelling results on major tourist sites.


[33]  Nicholas J. Higham,et al.  Functions of matrices - theory and computation , 2008 .

[34]  J. Hiriart-Urruty,et al.  Fundamentals of Convex Analysis , 2004 .

[35]  Bernhard Schölkopf,et al.  Estimating the Support of a High-Dimensional Distribution , 2001, Neural Computation.

[36]  Blake LeBaron,et al.  Extreme Value Theory and Fat Tails in Equity Markets , 2005 .

[37]  Richard Szeliski,et al.  Visual odometry and map correlation , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[38]  Xavier Pennec,et al.  A Riemannian Framework for Tensor Computing , 2005, International Journal of Computer Vision.

[39]  William T. Freeman,et al.  A probabilistic image jigsaw puzzle solver , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[40]  Yihong Gong,et al.  Linear spatial pyramid matching using sparse coding for image classification , 2009, CVPR.

[41]  Baba C. Vemuri,et al.  On A Nonlinear Generalization of Sparse Coding and Dictionary Learning , 2013, ICML.

[42]  Erik D. Demaine,et al.  Jigsaw Puzzles, Edge Matching, and Polyomino Packing: Connections and Complexity , 2007, Graphs Comb..

[43]  Fatih Murat Porikli,et al.  Region Covariance: A Fast Descriptor for Detection and Classification , 2006, ECCV.

[44]  R. Bhatia Positive Definite Matrices , 2007 .

[45]  Afshin Dehghan,et al.  Part-based multiple-person tracking with partial occlusion handling , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[46]  Jianxiong Xiao,et al.  Reconstructing the World's Museums , 2012, ECCV.

[47]  José Mario Martínez,et al.  Algorithm 813: SPG—Software for Convex-Constrained Optimization , 2001, TOMS.

[48]  Lei Zhang,et al.  Log-Euclidean Kernels for Sparse Representation and Dictionary Learning , 2013, 2013 IEEE International Conference on Computer Vision.

[49]  Catherine Wah,et al.  Attribute-Based Detection of Unfamiliar Classes with Humans in the Loop , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[50]  Terrance E. Boult,et al.  Multi-attribute spaces: Calibration for attribute fusion and similarity search , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[51]  David J. Field,et al.  Sparse coding with an overcomplete basis set: A strategy employed by V1? , 1997, Vision Research.

[52]  Vassilios Morellas,et al.  Dirichlet process mixture models on symmetric positive definite matrices for appearance clustering in video surveillance applications , 2011, CVPR 2011.

[53]  Andreas Geiger,et al.  Lost! Leveraging the Crowd for Probabilistic Visual Self-Localization , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[54]  Yuwei Wu,et al.  Affine Object Tracking Using Kernel-Based Region Covariance Descriptors , 2011 .

[55]  Steven M. Seitz,et al.  The Visual Turing Test for Scene Reconstruction , 2013, 2013 International Conference on 3D Vision.

[56]  Alexei A. Efros,et al.  Image sequence geolocation with human travel priors , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[57]  William T. Freeman,et al.  The Patch Transform , 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[58]  Tanaya Guha,et al.  Learning Sparse Representations for Human Action Recognition , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[59]  Brian C. Lovell,et al.  Sparse Coding and Dictionary Learning for Symmetric Positive Definite Matrices: A Kernel Approach , 2012, ECCV.

[60]  Noah Snavely,et al.  Accurate Georegistration of Point Clouds Using Geographic Data , 2013, 2013 International Conference on 3D Vision.

[61]  Andrew C. Gallagher Jigsaw puzzles with pieces of unknown orientation , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[62]  John Platt,et al.  Probabilistic Outputs for Support vector Machines and Comparisons to Regularized Likelihood Methods , 1999 .

[63]  Robert P. W. Duin,et al.  Support Vector Data Description , 2004, Machine Learning.

[64]  Vassilios Morellas,et al.  Compact covariance descriptors in 3D point clouds for object recognition , 2012, 2012 IEEE International Conference on Robotics and Automation.

[65]  Anderson Rocha,et al.  Robust Fusion: Extreme Value Theory for Recognition Score Normalization , 2010, ECCV.

[66]  Dieter Fox,et al.  A large-scale hierarchical multi-view RGB-D object dataset , 2011, 2011 IEEE International Conference on Robotics and Automation.

[67]  Avideh Zakhor,et al.  Indoor localization and visualization using a human-operated backpack system , 2010, 2010 International Conference on Indoor Positioning and Indoor Navigation.