Pairwise constraints based multiview features fusion for scene classification

Recently, we have witnessed a surge of interests of learning a low-dimensional subspace for scene classification. The existing methods do not perform well since they do not consider scenes' multiple features from different views in low-dimensional subspace construction. In this paper, we describe scene images by finding a group of features and explore their complementary characteristics. We consider the problem of multiview dimensionality reduction by learning a unified low-dimensional subspace to effectively fuse these features. The new proposed method takes both intraclass and interclass geometries into consideration, as a result the discriminability is effectively preserved because it takes into account neighboring samples which have different labels. Due to the semantic gap, the fusion of multiview features still cannot achieve excellent performance of scene classification in real applications. Therefore, a user labeling procedure is introduced in our approach. Initially, a query image is provided by the user, and a group of images are retrieved by a search engine. After that, users label some images in the retrieved set as relevant or irrelevant with the query. The must-links are constructed between the relevant images, and the cannot-links are built between the irrelevant images. Finally, an alternating optimization procedure is adopted to integrate the complementary nature of different views with the user labeling information, and develop a novel multiview dimensionality reduction method for scene classification. Experiments are conducted on the real-world datasets of natural scenes and indoor scenes, and the results demonstrate that the proposed method has the best performance in scene classification. In addition, the proposed method can be applied to other classification problems. The experimental results of shape classification on Caltech 256 suggest the effectiveness of our method. Highlights? We describe scene images through multimodal features. ? We explore the complementary characteristics of these features. ? We address multimodal dimensionality reduction by learning a unified low-dimensional subspace. ? We adopt alternating optimization to integrate the features with the user labeling information. ? We develop a novel multimodal dimensionality reduction method for scene classification.

[1]  H. Hotelling Analysis of a complex of statistical variables into principal components. , 1933 .

[2]  Antonio Torralba,et al.  Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope , 2001, International Journal of Computer Vision.

[3]  Yongdong Zhang,et al.  Multiview Spectral Embedding , 2010, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).

[4]  Shuicheng Yan,et al.  Neighborhood preserving embedding , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[5]  Xuelong Li,et al.  Asymmetric bagging and random subspace for support vector machines-based relevance feedback in image retrieval , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[6]  Mahdieh Soleymani Baghshah,et al.  Metric learning for semi-supervised clustering using pairwise constraints and the geometrical structure of data , 2009, Intell. Data Anal..

[7]  Anil K. Jain,et al.  Image retrieval using color and shape , 1996, Pattern Recognit..

[8]  Yihong Gong,et al.  Locality-constrained Linear Coding for image classification , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[9]  Mahdieh Soleymani Baghshah,et al.  Non-linear metric learning using pairwise similarity and dissimilarity constraints and the geometrical structure of data , 2010, Pattern Recognit..

[10]  Antonio Torralba,et al.  Recognizing indoor scenes , 2009, CVPR.

[11]  Edward H. Adelson,et al.  The Design and Use of Steerable Filters , 1991, IEEE Trans. Pattern Anal. Mach. Intell..

[12]  Cordelia Schmid,et al.  Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[13]  Nicolae Vizireanu,et al.  Generalizations of binary morphological shape decomposition , 2007, J. Electronic Imaging.

[14]  Meng Wang,et al.  Adaptive Hypergraph Learning and its Application in Image Classification , 2012, IEEE Transactions on Image Processing.

[15]  Shaogang Gong,et al.  Activity based surveillance video content modelling , 2008, Pattern Recognit..

[16]  Dacheng Tao,et al.  Sparse transfer learning for interactive video search reranking , 2012, TOMCCAP.

[17]  Xiaofei He,et al.  Locality Preserving Projections , 2003, NIPS.

[18]  Huan Liu,et al.  Multi-Source Feature Selection via Geometry-Dependent Covariance Analysis , 2008, FSDM.

[19]  James C. Bezdek,et al.  Some Notes on Alternating Optimization , 2002, AFSS.

[20]  Tomer Hertz,et al.  Learning Distance Functions using Equivalence Relations , 2003, ICML.

[21]  D. Donoho,et al.  Hessian Eigenmaps : new locally linear embedding techniques for high-dimensional data , 2003 .

[22]  J. Daugman Two-dimensional spectral analysis of cortical receptive field profiles , 1980, Vision Research.

[23]  Meng Wang,et al.  Optimizing multi-graph learning: towards a unified video annotation scheme , 2007, ACM Multimedia.

[24]  Jun Yu,et al.  Complex Object Correspondence Construction in Two-Dimensional Animation , 2011, IEEE Transactions on Image Processing.

[25]  Ramin Zabih,et al.  Comparing images using color coherence vectors , 1997, MULTIMEDIA '96.

[26]  Zhigang Luo,et al.  NeNMF: An Optimal Gradient Method for Nonnegative Matrix Factorization , 2012, IEEE Transactions on Signal Processing.

[27]  Christopher G. Harris,et al.  A Combined Corner and Edge Detector , 1988, Alvey Vision Conference.

[28]  Paolo Remagnino,et al.  Distributed intelligence for multi-camera visual surveillance , 2004, Pattern Recognit..

[29]  Xiaojin Zhu,et al.  Introduction to Semi-Supervised Learning , 2009, Synthesis Lectures on Artificial Intelligence and Machine Learning.

[30]  Jane You,et al.  Semi-supervised classification based on random subspace dimensionality reduction , 2012, Pattern Recognit..

[31]  Matthijs C. Dorst Distinctive Image Features from Scale-Invariant Keypoints , 2011 .

[32]  Kaizhu Huang,et al.  m-SNE: Multiview Stochastic Neighbor Embedding , 2011, IEEE Trans. Syst. Man Cybern. Part B.

[33]  Michael J. Swain,et al.  Color indexing , 1991, International Journal of Computer Vision.

[34]  Hongyuan Zha,et al.  Principal Manifolds and Nonlinear Dimension Reduction via Local Tangent Space Alignment , 2002, ArXiv.

[35]  Cheng Soon Ong,et al.  Multiclass multiple kernel learning , 2007, ICML '07.

[36]  Xuelong Li,et al.  Patch Alignment for Dimensionality Reduction , 2009, IEEE Transactions on Knowledge and Data Engineering.

[37]  R. Fisher THE USE OF MULTIPLE MEASUREMENTS IN TAXONOMIC PROBLEMS , 1936 .

[38]  Michael I. Jordan,et al.  Distance Metric Learning with Application to Clustering with Side-Information , 2002, NIPS.

[39]  E. M. Wright,et al.  Adaptive Control Processes: A Guided Tour , 1961, The Mathematical Gazette.

[40]  Jiebo Luo,et al.  Review of the State of the Art in Semantic Scene Classification , 2002 .

[41]  Wei Liu,et al.  Learning Distance Metrics with Contextual Constraints for Image Retrieval , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[42]  D. Donoho,et al.  Hessian eigenmaps: Locally linear embedding techniques for high-dimensional data , 2003, Proceedings of the National Academy of Sciences of the United States of America.

[43]  Simona Halunga,et al.  Morphological skeleton decomposition interframe interpolation method , 2010, J. Electronic Imaging.

[44]  Cordelia Schmid,et al.  Scale & Affine Invariant Interest Point Detectors , 2004, International Journal of Computer Vision.

[45]  Ke Huang,et al.  Wavelet Feature Selection for Image Classification , 2008, IEEE Transactions on Image Processing.

[46]  Xuelong Li,et al.  Geometric Mean for Subspace Selection , 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[47]  Jun Yu,et al.  On Combining Multiple Features for Cartoon Character Retrieval and Clip Synthesis , 2012, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).

[48]  H. Zha,et al.  Principal manifolds and nonlinear dimensionality reduction via tangent space alignment , 2004, SIAM J. Sci. Comput..

[49]  B. S. Manjunath,et al.  Texture Features for Browsing and Retrieval of Image Data , 1996, IEEE Trans. Pattern Anal. Mach. Intell..

[50]  Anil K. Jain,et al.  Texture classification and segmentation using multiresolution simultaneous autoregressive models , 1992, Pattern Recognit..

[51]  Mikhail Belkin,et al.  Laplacian Eigenmaps for Dimensionality Reduction and Data Representation , 2003, Neural Computation.

[52]  Nicolas Le Roux,et al.  Out-of-Sample Extensions for LLE, Isomap, MDS, Eigenmaps, and Spectral Clustering , 2003, NIPS.

[53]  S T Roweis,et al.  Nonlinear dimensionality reduction by locally linear embedding. , 2000, Science.

[54]  Ales Ude,et al.  Vision-Based Robot Path Planning , 1994 .

[55]  Meng Wang,et al.  Semisupervised Multiview Distance Metric Learning for Cartoon Synthesis , 2012, IEEE Transactions on Image Processing.

[56]  Andrew Zisserman,et al.  A Statistical Approach to Texture Classification from Single Images , 2004, International Journal of Computer Vision.

[57]  J. Tenenbaum,et al.  A global geometric framework for nonlinear dimensionality reduction. , 2000, Science.

[58]  Vladimir Vapnik,et al.  Statistical learning theory , 1998 .

[59]  Zhigang Luo,et al.  Online Nonnegative Matrix Factorization With Robust Stochastic Approximation , 2012, IEEE Transactions on Neural Networks and Learning Systems.

[60]  Philip S. Yu,et al.  A General Model for Multiple View Unsupervised Learning , 2008, SDM.

[61]  Hiroshi Mamitsuka,et al.  Efficient semi-supervised learning on locally informative multiple graphs , 2012, Pattern Recognit..

[62]  Cordelia Schmid,et al.  A Performance Evaluation of Local Descriptors , 2005, IEEE Trans. Pattern Anal. Mach. Intell..