Evaluating effectiveness of Latent Dirichlet Allocation model for scene classification

Scene classification from images is a challenging problem in computer vision because scenes vary significantly in scale, illumination, and viewpoint. Recently, the Latent Dirichlet Allocation (LDA) model has grown popular in computer vision, especially for scene labeling and classification. However, the effectiveness of the LDA model for scene classification has not yet been evaluated thoroughly. In particular, there has been little experimental evaluation of the model's performance across different types of features, even though fusing multiple feature types is usually necessary because of the complexity of scene images. In this paper, we investigate the effectiveness of the LDA model in scene classification using seven types of features (i.e., uniform-grid-based interest points, Harris-corner-based interest points, scale-invariant feature transform (SIFT), texture, shape, color, and location) and various combinations of them. Furthermore, we compare the performance of the LDA model with that of a Support Vector Machine (SVM) classifier. All experiments are performed on the UIUC Sport Scene database. The experiments demonstrate that the performance of the LDA model 1) is significantly lower than that of the SVM classifier across the different types of features; and 2) decreases when multiple features are fused, whereas the performance of the SVM classifier improves.
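The abstract does not say how the feature types are fused. As a minimal sketch of one common option for bag-of-visual-words models, the block below merges per-feature visual-word indices into a single shared vocabulary by offsetting each feature's codebook. The function name `fuse_visual_words`, the feature names, and the codebook sizes are hypothetical and chosen only for illustration; this is not claimed to be the procedure used in the paper.

```python
from collections import Counter


def fuse_visual_words(words_by_feature, vocab_sizes):
    """Merge per-feature visual-word indices into one shared vocabulary.

    words_by_feature: dict mapping a feature name to the list of visual-word
        indices observed in one image for that feature type.
    vocab_sizes: dict mapping each feature name to its codebook size.
    Returns (histogram, total_vocab_size); histogram is a list of counts.
    """
    # Assumption: each feature type gets a contiguous block of word indices.
    offsets = {}
    total = 0
    for name in sorted(vocab_sizes):  # fixed order keeps offsets reproducible
        offsets[name] = total
        total += vocab_sizes[name]

    counts = Counter()
    for name, words in words_by_feature.items():
        size = vocab_sizes[name]
        for w in words:
            if not 0 <= w < size:
                raise ValueError(f"word {w} outside codebook for {name!r}")
            counts[offsets[name] + w] += 1

    return [counts.get(i, 0) for i in range(total)], total


# Hypothetical toy image: three SIFT words and two color words.
hist, vocab = fuse_visual_words(
    {"sift": [0, 2, 2], "color": [1, 1]},
    {"sift": 3, "color": 2},
)
```

A fused histogram of this kind could serve as the word-count input to a topic model or as a feature vector for a discriminative classifier, which is one way the two model families in the abstract could be compared on identical inputs.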
