Scene understanding with tri-training

Scene understanding requires not only detecting the objects in a scene but also capturing the relationship between the objects and the scene, for example the plausible size and the likelihood of occurrence of an object at a given position. This relationship can greatly improve traditional object detection, which may otherwise accept detections whose size or position is implausible for the scene. In this paper, a novel scale model is proposed to describe this understanding of the scene. The scale model consists of the likelihood of occurrence and the plausible size of a pedestrian at each position in the scene. It is learned by counting pedestrian examples of different sizes at different positions over a period of time, rather than by computing geometry and viewpoint information from a single image. The examples are detected automatically by a detector trained with a tri-training-based semi-supervised approach. Experimental results indicate that the scale model can be learned with semi-supervised detection, without 3D geometry information and without assuming a planar ground.
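The counting idea behind the scale model can be illustrated with a minimal sketch. It is not the paper's implementation: the grid cell size, the thresholds, the function names `learn_scale_model` and `is_plausible`, and the representation of a detection as `(x, y, height)` in pixels are all assumptions made for illustration. The detections are assumed to come from some pedestrian detector run over a period of time.

```python
from collections import defaultdict


def learn_scale_model(detections, cell=40):
    """Accumulate detections into grid cells (hypothetical helper).

    detections: iterable of (x, y, height) tuples in pixels.
    Returns a dict mapping (row, col) -> (occurrence_frequency, mean_height).
    """
    counts = defaultdict(int)
    height_sum = defaultdict(float)
    for x, y, h in detections:
        key = (int(y // cell), int(x // cell))
        counts[key] += 1
        height_sum[key] += h

    total = sum(counts.values())
    model = {}
    for key, n in counts.items():
        # Relative frequency of pedestrians in this cell, and their mean height.
        model[key] = (n / total, height_sum[key] / n)
    return model


def is_plausible(detection, model, cell=40, min_occurrence=0.05, size_tolerance=0.3):
    """Check a new detection against the learned model (assumed thresholds)."""
    x, y, h = detection
    key = (int(y // cell), int(x // cell))
    if key not in model:
        return False  # no pedestrian was ever observed in this cell
    occurrence, mean_height = model[key]
    if occurrence < min_occurrence:
        return False
    # Accept only heights within a relative tolerance of the cell's mean height.
    return abs(h - mean_height) <= size_tolerance * mean_height


if __name__ == "__main__":
    observed = [(10, 10, 50), (12, 15, 70), (100, 90, 100)]
    model = learn_scale_model(observed)
    print(is_plausible((15, 20, 62), model))   # size close to the cell's mean
    print(is_plausible((15, 20, 120), model))  # size far from the cell's mean
```

A per-cell mean is the simplest possible summary. A histogram or a fitted distribution per cell would capture size variation better, at the cost of needing more examples per position.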
