Paper Information - Visual Feature Learning

Visual Feature Learning

Categorization is a fundamental problem in many computer vision applications, e.g., image classification, pedestrian detection, and face recognition. The robustness of a categorization system relies heavily on the quality of the features by which data are represented. Prior work on feature extraction can be grouped into levels, which, in bottom-up order, are low-level features (e.g., pixels and gradients) and middle/high-level features (e.g., the BoW model and sparse coding). Low-level features can be extracted directly from images or videos, while middle/high-level features are built on low-level features and are designed to enhance the capability of categorization systems based on different considerations (e.g., guaranteeing domain invariance and improving discriminative power). This thesis focuses on the study of visual feature learning. The remaining challenges in designing visual features lie in intra-class variation, occlusion, illumination and viewpoint changes, and insufficient prior knowledge. To address these challenges, I present several visual feature learning methods covering the following sub-topics: (i) I start by introducing a segmentation-based object recognition system. (ii) When training data are insufficient, I seek data from other sources, which include images or videos in a different domain, actions captured from a different viewpoint, and information in a different media form. To transfer such resources appropriately into the target categorization system, four transfer learning-based feature learning methods are presented, addressing cross-view, cross-domain, and cross-modality scenarios. (iii) Finally, I present a random forest-based feature fusion method for multi-view action recognition.

Fan Zhu
