Extended Hierarchical Gaussianization for scene classification

In this paper, we propose a novel image representation for scene classification. First, we model multiple-order statistics of image patches with a Gaussian mixture model (GMM) in a Bayesian framework. Second, we combine the mean and covariance information of the GMM and represent it as a mean-covariance supervector through a new distance metric. Experimental results on the fifteen scene category database show that the new representation, using only a nearest centroid classifier, significantly outperforms all existing methods.
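The pipeline summarized above can be illustrated with a minimal sketch. It is not the method of the paper, whose details are not given here. It assumes a prior GMM with diagonal covariances, the standard relevance-factor form of MAP adaptation for means and variances, and plain Euclidean distance standing in for the proposed distance metric. All names (`posteriors`, `supervector`, `fit_centroids`, `classify`), the toy one-dimensional patch features, and the labels are hypothetical.

```python
import math

def posteriors(x, weights, means, variances):
    """Posterior of each diagonal-covariance Gaussian component given patch x."""
    log_p = []
    for w, mu, var in zip(weights, means, variances):
        ll = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            ll -= 0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
        log_p.append(ll)
    top = max(log_p)  # subtract the max for numerical stability
    unnorm = [math.exp(lp - top) for lp in log_p]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def supervector(patches, weights, means, variances, relevance=1.0):
    """Concatenate MAP-adapted means and variances of every component."""
    post = [posteriors(x, weights, means, variances) for x in patches]
    vec = []
    for k, (mu, var) in enumerate(zip(means, variances)):
        n_k = sum(p[k] for p in post)      # soft count of component k
        alpha = n_k / (n_k + relevance)    # weight of image data vs. prior
        norm = max(n_k, 1e-12)             # guard against an empty component
        adapted_mu, adapted_var = [], []
        for d in range(len(mu)):
            ex = sum(p[k] * x[d] for p, x in zip(post, patches)) / norm
            ex2 = sum(p[k] * x[d] ** 2 for p, x in zip(post, patches)) / norm
            m = alpha * ex + (1 - alpha) * mu[d]
            v = alpha * ex2 + (1 - alpha) * (var[d] + mu[d] ** 2) - m ** 2
            adapted_mu.append(m)
            adapted_var.append(v)
        vec.extend(adapted_mu + adapted_var)
    return vec

def fit_centroids(vectors, labels):
    """Average the supervectors of each class."""
    sums, counts = {}, {}
    for v, y in zip(vectors, labels):
        acc = sums.setdefault(y, [0.0] * len(v))
        for i, vi in enumerate(v):
            acc[i] += vi
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

def classify(v, centroids):
    """Return the label of the closest centroid (Euclidean stand-in metric)."""
    return min(centroids,
               key=lambda y: sum((a - b) ** 2 for a, b in zip(v, centroids[y])))

# Hypothetical prior GMM: two components over 1-D patch features.
prior_w = [0.5, 0.5]
prior_mu = [[0.0], [4.0]]
prior_var = [[1.0], [1.0]]

# Toy "images", each a small set of 1-D patch descriptors.
train_images = [
    [[-0.5], [-0.4], [4.0], [4.1]],
    [[-0.6], [-0.5], [3.9], [4.0]],
    [[0.5], [0.6], [4.0], [4.1]],
    [[0.4], [0.5], [3.9], [4.0]],
]
train_labels = ["coast", "coast", "forest", "forest"]

train_vecs = [supervector(img, prior_w, prior_mu, prior_var) for img in train_images]
centroids = fit_centroids(train_vecs, train_labels)

query_vec = supervector([[-0.45], [-0.55], [4.0], [4.05]], prior_w, prior_mu, prior_var)
predicted = classify(query_vec, centroids)
print(predicted)
```

In this sketch each image becomes one fixed-length vector regardless of how many patches it contains, which is what allows a classifier as simple as nearest centroid to be applied directly.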
