Projectable classifiers for multi-view object class recognition

We propose a multi-view object class modeling framework based on a simplified camera model and surfels (each defined by a location and a normal direction in a normalized 3D coordinate system) that mediate coarse correspondences between different views. Weak classifiers are learnt relative to the reference frames provided by the surfels. We describe a weak classifier that uses contour information when its corresponding surfel projects to a contour element in the image, and color information when the face of the surfel is visible in the image. We emphasize that these weak classifiers can take many different forms and use many different image features. The weak classifiers are combined using AdaBoost. We evaluate the method on a public dataset [8], showing promising results on categorization, recognition/detection, pose estimation and image synthesis.
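As a rough illustration of how a surfel might select between the two cues, the sketch below projects a surfel into the image and labels it by the orientation of its normal relative to the viewing direction. It is not the paper's implementation: the scaled orthographic camera, the single-axis rotation, the tolerance `tol`, and the names `rotation_y` and `project_surfel` are all assumptions made for the example, and self-occlusion by the rest of the object is ignored.

```python
import numpy as np

def rotation_y(azimuth_rad):
    """Rotation about the vertical (y) axis; a stand-in for a viewpoint change."""
    c, s = np.cos(azimuth_rad), np.sin(azimuth_rad)
    return np.array([[c, 0.0, s],
                     [0.0, 1.0, 0.0],
                     [-s, 0.0, c]])

def project_surfel(location, normal, R, scale=1.0, offset=(0.0, 0.0), tol=0.2):
    """Project one surfel with an assumed scaled orthographic camera.

    Returns the 2D image position and a cue label:
      "face"    - normal points toward the camera (color cue could apply)
      "contour" - normal is roughly perpendicular to the view direction
      "hidden"  - normal points away from the camera
    """
    location = np.asarray(location, dtype=float)
    normal = np.asarray(normal, dtype=float)
    normal = normal / np.linalg.norm(normal)

    p_cam = R @ location          # surfel position in camera coordinates
    n_cam = R @ normal            # surfel normal in camera coordinates

    # Orthographic projection: drop depth, then scale and translate.
    xy = scale * p_cam[:2] + np.asarray(offset, dtype=float)

    # Assumed convention: the camera sits on the +z side looking along -z,
    # so the z component of the normal measures how much it faces the camera.
    facing = n_cam[2]
    if facing > tol:
        mode = "face"
    elif abs(facing) <= tol:
        mode = "contour"
    else:
        mode = "hidden"
    return xy, mode

# The same surfel changes role as the viewpoint rotates.
front = project_surfel([0.5, 0.2, 1.0], [0.0, 0.0, 1.0], rotation_y(0.0))
side = project_surfel([0.5, 0.2, 1.0], [0.0, 0.0, 1.0], rotation_y(np.pi / 2))
```

In this sketch the surfel is labeled "face" in the frontal view and "contour" after a 90 degree rotation, which is the kind of view-dependent switch between color and contour cues that the abstract describes.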

[1] Silvio Savarese, et al. Learning a dense multi-view representation for detection, viewpoint classification and synthesis of object categories, 2009, 2009 IEEE 12th International Conference on Computer Vision.

[2] Pascal Fua, et al. Joint pose estimator and feature learning for object detection, 2009, 2009 IEEE 12th International Conference on Computer Vision.

[3] Jitendra Malik, et al. Learning to detect natural image boundaries using local brightness, color, and texture cues, 2004, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[4] Matthijs C. Dorst. Distinctive Image Features from Scale-Invariant Keypoints, 2011.

[5] Xavier Perrotton, et al. Implicit hierarchical boosting for multi-view object detection, 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[6] D. Geman, et al. Stationary Features and Cat Detection, 2008.

[7] David G. Kirkpatrick, et al. Linear Time Euclidean Distance Algorithms, 1995, IEEE Trans. Pattern Anal. Mach. Intell.

[8] Yoav Freund, et al. A decision-theoretic generalization of on-line learning and an application to boosting, 1995, EuroCOLT.

[9] Bernt Schiele, et al. Analyzing appearance and contour based methods for object categorization, 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings.

[10] Cordelia Schmid, et al. Viewpoint-independent object class detection using 3D Feature Maps, 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[11] Antonio Torralba, et al. Part and appearance sharing: Recursive Compositional Models for multi-view, 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[12] Silvio Savarese, et al. 3D generic object categorization, localization and pose estimation, 2007, 2007 IEEE 11th International Conference on Computer Vision.

[13] Bill Triggs, et al. Histograms of oriented gradients for human detection, 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[14] Cordelia Schmid, et al. Multi-view object class detection with a 3D geometric model, 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[15] Luc Van Gool, et al. Backprojection Revisited: Scalable Multi-view Object Detection and Similarity Metrics for Detections, 2010, ECCV.

[16] Y. LeCun, et al. Learning methods for generic object recognition with invariance to pose and lighting, 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004.

[17] Bernt Schiele, et al. Robust Object Detection with Interleaved Categorization and Segmentation, 2008, International Journal of Computer Vision.

[18] Antonio Torralba, et al. Sharing Visual Features for Multiclass and Multiview Object Detection, 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[19] Yoram Singer, et al. Improved Boosting Algorithms Using Confidence-rated Predictions, 1998, COLT '98.

[20] Paul A. Viola, et al. Robust Real-Time Face Detection, 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.