Learning people detection models from few training samples

People detection is an important task for a wide range of applications in computer vision. State-of-the-art methods learn appearance based models requiring tedious collection and annotation of large data corpora. Also, obtaining data sets representing all relevant variations with sufficient accuracy for the intended application domain at hand is often a non-trivial task. Therefore this paper investigates how 3D shape models from computer graphics can be leveraged to ease training data generation. In particular we employ a rendering-based reshaping method in order to generate thousands of synthetic training samples from only a few persons and views. We evaluate our data generation method for two different people detection models. Our experiments on a challenging multi-view dataset indicate that the data from as few as eleven persons suffices to achieve good performance. When we additionally combine our synthetic training samples with real data we even outperform existing state-of-the-art methods.

[1]  Jitendra Malik,et al.  Shape matching and object recognition using shape contexts , 2010, 2010 3rd International Conference on Computer Science and Information Technology.

[2]  Daniel P. Huttenlocher,et al.  Pictorial Structures for Object Recognition , 2004, International Journal of Computer Vision.

[3]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[4]  David G. Lowe,et al.  Three-Dimensional Object Recognition from Single Two-Dimensional Images , 1987, Artif. Intell..

[5]  Yoav Freund,et al.  A decision-theoretic generalization of on-line learning and an application to boosting , 1995, EuroCOLT.

[6]  Martin A. Fischler,et al.  The Representation and Matching of Pictorial Structures , 1973, IEEE Transactions on Computers.

[7]  Andrew Zisserman,et al.  Progressive search space reduction for human pose estimation , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[8]  Dariu Gavrila,et al.  Monocular Pedestrian Detection: Survey and Experiments , 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[9]  Ivan Laptev,et al.  Improving object detection with boosted histograms , 2009, Image Vis. Comput..

[10]  Sebastian Thrun,et al.  SCAPE: shape completion and animation of people , 2005, SIGGRAPH 2005.

[11]  David A. McAllester,et al.  Object Detection with Discriminatively Trained Part Based Models , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[12]  D. Marr,et al.  Representation and recognition of the spatial organization of three-dimensional shapes , 1978, Proceedings of the Royal Society of London. Series B. Biological Sciences.

[13]  Hans-Peter Seidel,et al.  A Statistical Model of Human Pose and Body Shape , 2009, Comput. Graph. Forum.

[14]  Hans-Peter Seidel,et al.  MovieReshape: tracking and reshaping of humans in videos , 2010, ACM Trans. Graph..

[15]  Luc Van Gool,et al.  The Pascal Visual Object Classes (VOC) Challenge , 2010, International Journal of Computer Vision.

[16]  Pietro Perona,et al.  Pedestrian detection: A benchmark , 2009, CVPR.

[17]  Michael J. Black,et al.  Detailed Human Shape and Pose from Images , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[18]  Bernt Schiele,et al.  Pictorial structures revisited: People detection and articulated pose estimation , 2009, CVPR.

[19]  Alberto Broggi,et al.  Model-based validation approaches and matching techniques for automotive vision based pedestrian detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05) - Workshops.

[20]  Subhransu Maji,et al.  Classification using intersection kernel support vector machines is efficient , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[21]  Michael Goesele,et al.  Back to the Future: Learning Shape Models from 3D CAD Data , 2010, BMVC.

[22]  Michael J. Black,et al.  A 2D Human Body Model Dressed in Eigen Clothing , 2010, ECCV.

[23]  Rodney A. Brooks,et al.  The ACRONYM Model-Based Vision System , 1979, IJCAI.

[24]  David Vázquez,et al.  Learning appearance in virtual scenarios for pedestrian detection , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[25]  Trevor Darrell,et al.  Inferring 3D structure with a statistical image-based shape model , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[26]  Bernt Schiele,et al.  Monocular 3D pose estimation and tracking by detection , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[27]  Cordelia Schmid,et al.  Viewpoint-independent object class detection using 3D Feature Maps , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[28]  Dariu Gavrila,et al.  A mixed generative-discriminative framework for pedestrian classification , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.