Discover Novel Visual Categories From Dynamic Hierarchies Using Multimodal Attributes

Learning novel visual categories from observations and experiences in unexplored environments is a vitally important cognitive ability for human beings. A dynamic category hierarchy, an inherent structure in the human mind, is a key component of this ability. This paper develops a framework to build a dynamic category hierarchy based on object attributes and a topic model. Since humans tend to use multimodal information to learn novel categories, we also develop an algorithm to learn multimodal object attributes from multimodal data. The new multimodal attributes describe objects efficiently and generalize from learned categories to novel ones. Compared with a state-of-the-art unimodal attribute, the multimodal attributes achieve improvements of 4%-19% on average. We also develop a constrained topic model, which can accurately construct category hierarchies for large-scale categories. Based on the multimodal attributes and the constrained topic model, the framework can effectively detect novel categories and relate them to known categories for further category learning. Extensive experiments are conducted on a public multimodal dataset containing color and point cloud data to evaluate the multimodal attributes and the dynamic category hierarchy. The experimental results show that the multimodal attributes describe objects effectively and that the dynamic category hierarchy performs satisfactorily in discovering novel categories. Compared with state-of-the-art methods, the dynamic category hierarchy achieves a 7% improvement.
