Paper information - Perceptual multi-channel visual feature fusion for scene categorization


Abstract Effectively recognizing scenes from a variety of categories is an indispensable but challenging task in computer vision and intelligent systems. In this work, we propose a novel image kernel based on human gaze shifting, aiming to discover the mechanism by which humans perceive visually or semantically salient regions within a scene. More specifically, we first design a weakly supervised embedding algorithm that projects local image features (i.e., graphlets in this work) onto a pre-defined semantic space. We thereby describe each graphlet by multiple visual features at both the low level and the high level. It is generally acknowledged that humans attend to only a few regions within a scene. We therefore formulate a sparsity-constrained graphlet ranking algorithm that incorporates visual cues at both levels. According to human visual perception, the top-ranked graphlets are either visually or semantically salient. We sequentially connect them into a path that mimics human gaze shifting. Lastly, a so-called gaze shifting kernel (GSK) is calculated from the paths learned over a collection of scene images, and a kernel SVM is employed to predict the scene categories. Comprehensive experiments on a series of well-known scene image sets show the competitiveness and robustness of our GSK. We also demonstrate the high consistency of the predicted paths with real human gaze shifting paths.
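The pipeline described in the abstract (rank graphlets, link the top-ranked ones into a path, compare paths with a kernel) can be illustrated with a minimal sketch. This is not the paper's GSK. The abstract does not give the ranking objective or the kernel formula, so the sketch assumes that each graphlet already has a feature vector and a saliency score, and it swaps in a simple position-wise average of Gaussian (RBF) similarities as the path kernel. The names `top_k_path`, `path_kernel`, `gram_matrix`, and the parameter `gamma` are all hypothetical.

```python
import math


def top_k_path(features, scores, k):
    """Order graphlets by descending score and keep the top k as a path.

    Assumption: `scores` stands in for the output of the paper's
    sparsity-constrained ranking, which is not specified in the abstract.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [features[i] for i in order[:k]]


def rbf(x, y, gamma=1.0):
    """Gaussian similarity between two feature vectors of equal length."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def path_kernel(path_a, path_b, gamma=1.0):
    """Illustrative path kernel: mean RBF similarity of same-position graphlets.

    Assumption: both paths have the same length k, so the result is an
    average of valid kernels over aligned positions.
    """
    if len(path_a) != len(path_b):
        raise ValueError("paths must have the same length")
    if not path_a:
        return 0.0
    total = sum(rbf(a, b, gamma) for a, b in zip(path_a, path_b))
    return total / len(path_a)


def gram_matrix(paths, gamma=1.0):
    """Pairwise kernel values for a list of paths, one path per image."""
    return [[path_kernel(p, q, gamma) for q in paths] for p in paths]
```

A Gram matrix built this way could be passed to any kernel SVM implementation that accepts precomputed kernels, for example scikit-learn's `SVC` with `kernel="precomputed"`. Whether the position-wise average resembles the paper's actual kernel cannot be determined from the abstract alone.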
