Paper Information - Computer Vision – ECCV 2014

We introduce an approach for analyzing annotated maps of a site, together with Internet photos, to reconstruct large indoor spaces of famous tourist sites. While current 3D reconstruction algorithms often produce a set of disconnected components (3D pieces) for indoor scenes due to scene coverage or matching failures, we make use of a provided map to lay out the 3D pieces in a global coordinate system. Our approach leverages position, orientation, and shape cues extracted from the map and 3D pieces and optimizes a global objective to recover the global layout of the pieces. We introduce a novel crowd flow cue that measures how people move across the site to recover 3D geometry orientation. We show compelling results on major tourist sites.
