Probabilistic Parameter Selection for Learning Scene Structure from Video

We present an online learning approach that robustly combines unreliable observations from a pedestrian detector to estimate the rough 3D scene geometry from video sequences captured by a static camera. Our approach is based on an entropy modelling framework, which allows us to simultaneously adapt the detector parameters such that the expected information gain about the scene structure is maximised. As a result, our approach automatically restricts the detector scale range for each image region as the estimates become more confident, thus improving detector run-time and limiting false positives.
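The idea of choosing detector parameters by expected information gain can be illustrated with a toy sketch. It is not the model used in this work; everything in it is an assumption made for illustration: a discrete belief over pedestrian scale bins for a single image region, a binary observation (a detection fires or it does not), and the hypothetical names `expected_information_gain`, `best_scale_range`, `p_detect`, and `p_false`. The gain is computed as the prior entropy minus the expected posterior entropy.

```python
import math


def entropy(p):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(q * math.log2(q) for q in p if q > 0)


def expected_information_gain(belief, lo, hi, p_detect=0.8, p_false=0.1):
    """Expected entropy reduction when the detector scans scale bins lo..hi.

    Assumed toy observation model: the detector fires with probability
    p_detect if the true scale bin lies inside [lo, hi], and with
    probability p_false (a false positive) otherwise.
    """
    fire = [p_detect if lo <= k <= hi else p_false for k in range(len(belief))]
    gain = entropy(belief)
    for fired in (True, False):
        like = fire if fired else [1.0 - f for f in fire]
        p_obs = sum(l * b for l, b in zip(like, belief))
        if p_obs > 0:
            # Bayes update of the belief for this observation outcome
            posterior = [l * b / p_obs for l, b in zip(like, belief)]
            gain -= p_obs * entropy(posterior)
    return gain


def best_scale_range(belief, **model):
    """Return the contiguous scale range (lo, hi) with the largest expected gain."""
    n = len(belief)
    ranges = [(lo, hi) for lo in range(n) for hi in range(lo, n)]
    return max(ranges, key=lambda r: expected_information_gain(belief, *r, **model))


if __name__ == "__main__":
    uniform = [0.25, 0.25, 0.25, 0.25]
    print(best_scale_range(uniform))
```

In this toy model, scanning every scale bin yields no expected gain, because the observation is then equally likely under every hypothesis. That is one way to see why a restricted scale range can be more informative as well as cheaper to evaluate.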
