Image categorization via robust pLSA

The expectation maximization (EM) algorithm used to train the probabilistic latent semantic analysis (pLSA) model is sensitive to initialization. This paper presents a novel method that uses rival penalized competitive learning (RPCL) to provide a good initial estimate of the model. pLSA, a generative model from the statistical text literature, is applied to the bag-of-words representation of each image in the database. For images containing multiple object categories (e.g., grass, roads, and buildings), we aim to discover the objects (i.e., latent topics) in an unsupervised way using pLSA. Based on the discovered topics, image categorization is then carried out by an ensemble-based support vector machine (SVM). Our experiments show that the pLSA model with RPCL initialization, followed by ensemble-based SVM categorization, is robust to changes in the visual vocabulary and the number of latent topics.
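To make the role of initialization concrete, the following is a minimal sketch of pLSA trained by EM on an image-by-visual-word count matrix. It is not the authors' implementation: the function name `plsa_em`, its parameters, and the random fallback initialization are assumptions made for illustration. The optional `p_w_z_init` argument marks where an externally computed starting point for P(w|z), such as one derived from RPCL, could be supplied; how the paper derives that estimate is not shown here.

```python
import numpy as np

def plsa_em(counts, n_topics, p_w_z_init=None, n_iter=50, seed=0):
    """Fit pLSA by EM on a (n_docs, n_words) count matrix.

    Hypothetical helper for illustration. Returns P(z|d) with shape
    (n_docs, n_topics) and P(w|z) with shape (n_topics, n_words).
    """
    counts = np.asarray(counts, dtype=float)
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    eps = 1e-12  # guards against division by zero

    # Starting point for P(w|z): random unless the caller supplies one.
    # EM only finds a local optimum, so this choice affects the result.
    if p_w_z_init is None:
        p_w_z = rng.random((n_topics, n_words))
    else:
        p_w_z = np.asarray(p_w_z_init, dtype=float).copy()
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)

    # Starting point for P(z|d): random (an assumption of this sketch).
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z).
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]
        post = joint / (joint.sum(axis=1, keepdims=True) + eps)

        # M-step: re-estimate both distributions from expected counts.
        weighted = counts[:, None, :] * post
        p_w_z = weighted.sum(axis=0) + eps
        p_w_z /= p_w_z.sum(axis=1, keepdims=True)
        p_z_d = weighted.sum(axis=2) + eps
        p_z_d /= p_z_d.sum(axis=1, keepdims=True)

    return p_z_d, p_w_z
```

Each row of the returned P(z|d) is a topic mixture for one image and could serve as the feature vector passed to a downstream classifier. Running the function with different seeds and comparing the results gives a quick sense of how strongly the solution depends on the starting point.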
