Multimedia LEGO: Learning Structured Model by Probabilistic Logic Ontology Tree

Recent advances in multimedia research have produced large collections of concept models, e.g., LSCOM and MediaMill 101, that are now accessible to other researchers. While most current research still focuses on building new concepts from scratch, little effort has gone into constructing new concepts on top of the existing models already in the warehouse. To address this issue, we develop a new framework, termed LEGO, that seamlessly integrates new target training examples with existing primitive concept models. LEGO treats the primitive concept models as LEGO bricks from which a potentially unlimited vocabulary of new concepts can be constructed. Specifically, LEGO first formulates logic operations as the connectors that combine existing concept models hierarchically in probabilistic logic ontology trees. LEGO then simultaneously incorporates new target training information to efficiently disambiguate the underlying logic tree and correct error propagation. We present extensive experimental results on a large vehicle-domain data set from ImageNet and demonstrate significantly superior performance over existing state-of-the-art approaches that build new concept models from scratch.
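To make the idea of logic operations as connectors concrete, the following is a minimal sketch of how primitive concept scores might be combined in a tree of probabilistic AND, OR, and NOT nodes. It is not the formulation used in the paper: the product rule for AND, the noisy-OR rule for OR, the independence assumption behind both, and all names (`Leaf`, `And`, `Or`, `Not`, `evaluate`, and the example concepts) are assumptions made for illustration only. The learning step that fits the tree to target training examples is not shown.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union


@dataclass(frozen=True)
class Leaf:
    """A primitive concept model, referenced by name (hypothetical)."""
    name: str


@dataclass(frozen=True)
class Not:
    child: "Node"


@dataclass(frozen=True)
class And:
    children: Tuple["Node", ...]


@dataclass(frozen=True)
class Or:
    children: Tuple["Node", ...]


Node = Union[Leaf, Not, And, Or]


def evaluate(node: Node, scores: Dict[str, float]) -> float:
    """Propagate primitive concept probabilities up the logic tree.

    Assumes the children of each node are independent, which is a
    simplification chosen for this sketch.
    """
    if isinstance(node, Leaf):
        return scores[node.name]
    if isinstance(node, Not):
        return 1.0 - evaluate(node.child, scores)
    if isinstance(node, And):
        # Product rule: all children must hold.
        p = 1.0
        for child in node.children:
            p *= evaluate(child, scores)
        return p
    if isinstance(node, Or):
        # Noisy-OR: at least one child holds.
        q = 1.0
        for child in node.children:
            q *= 1.0 - evaluate(child, scores)
        return 1.0 - q
    raise TypeError(f"unknown node type: {type(node).__name__}")


# Hypothetical target concept "boat" built from primitive concepts.
boat = And((
    Leaf("vehicle"),
    Or((Leaf("water"), Leaf("harbor"))),
    Not(Leaf("road")),
))

# Hypothetical outputs of primitive concept models for one image.
scores = {"vehicle": 0.9, "water": 0.8, "harbor": 0.5, "road": 0.1}
boat_score = evaluate(boat, scores)
```

With these example scores, the OR node evaluates to 1 - 0.2 × 0.5 = 0.9, and the root combines 0.9 × 0.9 × 0.9, giving roughly 0.729. Because errors in the primitive scores multiply through the tree, a sketch like this also illustrates why the abstract emphasizes correcting error propagation with target training data.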
