Scene Recognition via Semi-Supervised Multi-Feature Regression

With the spread of visual sensor equipment (e.g., smartphones, vehicle cameras, surveillance cameras and camcorders), scene recognition has attracted much attention because of its potential applications in visual surveillance, intelligent traffic and aerial remote sensing. Although the field has made progress in recent years, the complexity of scene images and the scarcity of labeled data remain challenging. To fuse the multiple features of each image effectively and to exploit both labeled and unlabeled images, we propose a semi-supervised multi-feature regression (SSMFR) model for scene recognition. The SSMFR model has three advantages. First, it propagates labels from labeled data to unlabeled data using graph-based semi-supervised learning, so that information from both kinds of data contributes to performance. Second, it employs multiple graphs to characterize the structures of the multiple feature spaces and adaptively assigns a weight to each graph. SSMFR can therefore preserve the manifold structure of the samples in each feature space and exploit the complementary information carried by the different features. Third, it adopts an $l_{2,1}$-norm constraint to learn a sparse and robust classifier. To solve the SSMFR model, we propose a simple and efficient iterative update optimization scheme, and we establish the convergence of SSMFR through theoretical analysis and experiments. Experiments on several benchmark scene datasets show that the proposed SSMFR model achieves better scene recognition performance than several state-of-the-art algorithms.
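The abstract gives no formulas, so the following is only a minimal sketch of two ingredients it names, not the SSMFR optimization itself. It computes the $l_{2,1}$ norm of a matrix (the sum of the Euclidean norms of its rows) and runs closed-form graph label propagation on a weighted sum of per-feature graphs. Everything here is an assumption made for illustration: the function names, the Gaussian affinity, the parameters `sigma` and `alpha`, and in particular the fixed graph weights, which SSMFR is described as learning adaptively.

```python
import numpy as np


def l21_norm(W):
    """l_{2,1} norm: sum of the Euclidean norms of the rows of W."""
    return float(np.sum(np.linalg.norm(W, axis=1)))


def normalized_affinity(X, sigma=1.0):
    """Gaussian affinity with zero diagonal, normalized as D^{-1/2} A D^{-1/2}."""
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    A = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(A, 0.0)
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    return A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def propagate_labels(views, Y, weights, alpha=0.9):
    """Solve (I - alpha * S) F = Y, where S is a weighted sum of per-view graphs.

    The weights are fixed inputs here (an assumption); they are not learned.
    """
    S = sum(w * normalized_affinity(X) for w, X in zip(weights, views))
    n = Y.shape[0]
    F = np.linalg.solve(np.eye(n) - alpha * S, Y)
    return F.argmax(axis=1)


# Toy data (hypothetical): two well-separated classes seen through two feature views.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [5.0, 5.0]])
labels_true = np.repeat([0, 1], 10)
view_a = centers[labels_true] + 0.3 * rng.standard_normal((20, 2))
view_b = centers[labels_true] + 0.3 * rng.standard_normal((20, 2))

# Only one sample per class is labeled; all other rows of Y stay zero.
Y = np.zeros((20, 2))
Y[0, 0] = 1.0
Y[10, 1] = 1.0

pred = propagate_labels([view_a, view_b], Y, weights=[0.5, 0.5])
```

With non-negative weights that sum to one and `alpha` below one, the linear system is well posed, which is why the sketch can use a direct solve instead of an iterative update.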


[35]  Xuelong Li,et al.  Multi-View Clustering and Semi-Supervised Classification with Adaptive Neighbours , 2017, AAAI.

[36]  Xinhang Song,et al.  Multi-Scale Multi-Feature Context Modeling for Scene Recognition in the Semantic Manifold , 2017, IEEE Transactions on Image Processing.

[37]  Fei-Fei Li,et al.  What, where and who? Classifying events by scene and object recognition , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[38]  Jian Yang,et al.  Low rank representation with adaptive distance penalty for semi-supervised subspace classification , 2017, Pattern Recognit..

[39]  Jianzhong Wang,et al.  Label propagation based semi-supervised non-negative matrix factorization for feature extraction , 2015, Neurocomputing.

[40]  Jeffrey W Hoffmeister,et al.  Evaluation of breast cancer with a computer‐aided detection system by mammographic appearance and histopathology , 2005, Cancer.

[41]  Cordelia Schmid,et al.  Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[42]  Yongdong Zhang,et al.  Multiview Spectral Embedding , 2010, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).

[43]  Xueqi Ma,et al.  $p$ -Laplacian Regularization for Scene Recognition , 2019, IEEE Transactions on Cybernetics.

[44]  Matthijs C. Dorst Distinctive Image Features from Scale-Invariant Keypoints , 2011 .

[45]  Yi Yang,et al.  Semisupervised Feature Selection via Spline Regression for Video Semantic Recognition , 2015, IEEE Transactions on Neural Networks and Learning Systems.

[46]  Chun Chen,et al.  Relational Multimanifold Coclustering , 2013, IEEE Transactions on Cybernetics.

[47]  Yufeng Wang,et al.  Superpixel-Based Feature for Aerial Image Scene Recognition , 2018, Sensors.

[48]  Tommy W. S. Chow,et al.  A general soft label based Linear Discriminant Analysis for semi-supervised dimensionality reduction , 2014, Neural Networks.

[49]  Zhimin Wang,et al.  An adaptive spatial information-theoretic fuzzy clustering algorithm for image segmentation , 2013, Comput. Vis. Image Underst..

[50]  Gabriela Csurka,et al.  Visual categorization with bags of keypoints , 2002, eccv 2004.

[51]  Luis Herranz,et al.  Scene Recognition with CNNs: Objects, Scales and Dataset Bias , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[52]  Robert Marti,et al.  Which is the best way to organize/classify images by content? , 2007, Image Vis. Comput..

[53]  Pietro Perona,et al.  A Bayesian hierarchical model for learning natural scene categories , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[54]  Xiaoqiang Lu,et al.  Scene Recognition by Manifold Regularized Deep Learning Architecture , 2015, IEEE Transactions on Neural Networks and Learning Systems.

[55]  Bernhard Schölkopf,et al.  Cluster Kernels for Semi-Supervised Learning , 2002, NIPS.

[56]  Chao Bi,et al.  Multicriteria-Based Active Discriminative Dictionary Learning for Scene Recognition , 2018, IEEE Access.

[57]  Chao Bi,et al.  Semi-supervised local ridge regression for local matching based face recognition , 2015, Neurocomputing.

[58]  Zhao Wang,et al.  Adaptive multi-view feature selection for human motion retrieval , 2016, Signal Process..

[59]  Matti Pietikäinen,et al.  A comparative study of texture measures with classification based on featured distributions , 1996, Pattern Recognit..

[60]  Mohammed Bennamoun,et al.  A Discriminative Representation of Convolutional Features for Indoor Scene Recognition , 2015, IEEE transactions on image processing : a publication of the IEEE Signal Processing Society.

[61]  Illah R. Nourbakhsh,et al.  Appearance-based place recognition for topological localization , 2000, Proceedings 2000 ICRA. Millennium Conference. IEEE International Conference on Robotics and Automation. Symposia Proceedings (Cat. No.00CH37065).

[62]  Hao Su,et al.  Objects as Attributes for Scene Classification , 2010, ECCV Workshops.

[63]  Zi Huang,et al.  Multi-Feature Fusion via Hierarchical Regression for Multimedia Analysis , 2013, IEEE Transactions on Multimedia.

[64]  David A. Forsyth,et al.  Learning Large-Scale Automatic Image Colorization , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[65]  Matti Pietikäinen,et al.  Multiresolution Gray-Scale and Rotation Invariant Texture Classification with Local Binary Patterns , 2002, IEEE Trans. Pattern Anal. Mach. Intell..

[66]  Michael R. Lyu,et al.  Bridging the Semantic Gap Between Image Contents and Tags , 2010, IEEE Transactions on Multimedia.

[67]  Luis Herranz,et al.  Joint multi-feature spatial context for scene recognition in the semantic manifold , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[68]  Quanquan Gu,et al.  Co-clustering on manifolds , 2009, KDD.

[69]  Wei Liu,et al.  Double Fusion for Multimedia Event Detection , 2012, MMM.

[70]  Feiping Nie,et al.  Efficient and Robust Feature Selection via Joint ℓ2, 1-Norms Minimization , 2010, NIPS.

[71]  Ruizhi Chen,et al.  Scene Recognition for Indoor Localization Using a Multi-Sensor Fusion Approach , 2017, Sensors.