Discriminative techniques for the recognition of complex-shaped objects

This thesis presents new techniques that enable the automatic recognition of everyday objects, such as chairs and ladders, in images of highly cluttered scenes. Given an image, we extract information about the shape and texture properties present in small patches of the image and use that information to identify parts of the objects we are interested in. We then assemble those parts into overall hypotheses about which objects are present in the image and where they are. Solving this problem in a general setting is one of the central challenges in computer vision.

The central theme of this work is that formulating object recognition as a discrimination problem can ease the burden of system design. In particular, we show that thinking of recognition in terms of discriminating between objects and clutter, rather than separately modeling the appearances of objects and clutter, can simplify the processes of extracting information from the image and identifying which parts of the image correspond to parts of objects.

The bulk of this thesis is concerned with recognizing “wiry” objects in highly cluttered images; an example problem is finding ladders in images of a messy warehouse space. Wiry objects are distinguished by a prevalence of very thin, elongated, stick-like components; examples include tables, chairs, bicycles, and desk lamps. They are difficult to recognize because they tend to lack distinctive color or texture characteristics and their appearance is not easy to describe succinctly in terms of rectangular patches of image pixels. Here, we present a set of algorithms that extends current capabilities to find wiry objects in highly cluttered images across changes in the clutter and object pose.

The second part of the thesis presents a technique for extracting texture features from images in such a way that features from objects of interest are both well-clustered with each other and well-separated from the features from clutter. We present an optimization framework for automatically combining existing texture features into features that discriminate well, thus simplifying the process of tuning the parameters of the feature extraction process.
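
To make the idea of features that are "well-clustered" within a class and "well-separated" between classes concrete, the sketch below scores a weighted combination of two existing texture responses with a Fisher-style ratio and picks the best weighting by grid search. This is only an illustration under stated assumptions, not the optimization framework developed in the thesis: the Fisher-style criterion, the grid search, the function names, and the toy patch values are all hypothetical choices made for this example.

```python
from statistics import mean, pvariance


def separation_score(object_values, clutter_values):
    """Fisher-style ratio: squared gap between class means over summed class variances."""
    gap = mean(object_values) - mean(clutter_values)
    spread = pvariance(object_values) + pvariance(clutter_values)
    return gap * gap / spread if spread > 0 else float("inf")


def combine(features, weights):
    """Weighted sum of the existing feature responses, giving one scalar per patch."""
    return [sum(w * x for w, x in zip(weights, patch)) for patch in features]


def best_weights(object_feats, clutter_feats, steps=20):
    """Grid-search weights (w, 1 - w), w in [0, 1], that maximize the separation score."""
    best = None
    for i in range(steps + 1):
        w = i / steps
        weights = (w, 1.0 - w)
        score = separation_score(combine(object_feats, weights),
                                 combine(clutter_feats, weights))
        if best is None or score > best[0]:
            best = (score, weights)
    return best


# Hypothetical toy data: each patch carries two existing texture responses.
# The first response differs between classes; the second is mostly noise.
OBJECT_PATCHES = [(0.9, 0.2), (1.1, 0.8), (1.0, 0.5), (0.8, 0.4)]
CLUTTER_PATCHES = [(0.1, 0.6), (0.3, 0.1), (0.2, 0.7), (0.0, 0.3)]

if __name__ == "__main__":
    score, weights = best_weights(OBJECT_PATCHES, CLUTTER_PATCHES)
    print(f"weights={weights}, score={score:.2f}")
```

Because the criterion is computed directly from labeled object and clutter patches, the weighting is chosen by the data rather than tuned by hand, which is the kind of simplification the discriminative formulation is meant to offer.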
