Statistics of 3D Object Locations in Images

The structural design of buildings and streets in man-made outdoor environments determines the possible locations of other objects such as cars and people. Cars drive or park on the streets, and people walk on the sidewalks. Their space of interaction is limited by, and aligned to, these man-made structures. In this report, we describe how the 3D locations of objects can be estimated from a single-view image. We gather statistics on the locations of cars and people and show that their alignment to man-made structures creates regularities that can be used to predict where they appear. We show which prior knowledge of the analyzed scene each form of statistical prediction requires, and we evaluate prediction performance on a data set of urban outdoor scenes.
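The abstract does not spell out how a 3D location is recovered from one image, so the following is only a minimal sketch of one common approach, not necessarily the method of this report. It assumes a pinhole camera with a roughly level optical axis, a flat ground plane, and objects standing on that plane. All names and parameters (`focal_px`, `cx`, `horizon_v`, `cam_height`) are hypothetical and chosen for illustration.

```python
def ground_point_from_pixel(u, v, focal_px, cx, horizon_v, cam_height):
    """Back-project the image point where an object touches the ground.

    Assumptions (illustrative, not taken from the report):
    - image rows grow downwards, so ground points lie below the horizon row
    - the camera is cam_height metres above a flat ground plane
    - the optical axis is approximately parallel to the ground
    Returns (x, z): lateral offset and depth in metres, camera-centred.
    """
    dv = v - horizon_v
    if dv <= 0:
        raise ValueError("a ground contact point must lie below the horizon")
    z = focal_px * cam_height / dv   # similar triangles on the ground plane
    x = (u - cx) * z / focal_px      # lateral offset at that depth
    return x, z


def object_height(v_top, v_bottom, horizon_v, cam_height):
    """Estimate the metric height of an object standing on the ground.

    Uses the ratio of the object's image extent to the distance between
    its foot point and the horizon row.
    """
    dv = v_bottom - horizon_v
    if dv <= 0:
        raise ValueError("the foot point must lie below the horizon")
    return cam_height * (v_bottom - v_top) / dv


if __name__ == "__main__":
    # Hypothetical camera: 800 px focal length, horizon at row 240, 1.6 m high.
    x, z = ground_point_from_pixel(420, 400, 800.0, 320.0, 240.0, 1.6)
    h = object_height(200, 400, 240.0, 1.6)
    print(f"x = {x:.2f} m, z = {z:.2f} m, height = {h:.2f} m")
```

Under these assumptions the horizon row and the camera height are the prior knowledge of the scene that the estimate depends on; without them, only relative positions in the image are available.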
