Nonlinear feature selection on attributed networks

Abstract The accelerating growth of high-dimensional nodal attributes in various data mining tasks highlights the significance of feature selection on networked data. Because class labels of nodes are often scarce, many feature selection methods are proposed in semi-supervised or unsupervised rather than supervised settings. More often than not, features and (pseudo) labels are correlated in a nonlinear way. In these circumstances, the vast majority of existing linear algorithms do not work well, since they select features according to how well each feature linearly explains the variance of the labels. Moreover, although some methods focus on nonlinear feature selection, they neglect the link relations between data and are therefore difficult to apply to attributed networks directly. In this paper, we investigate how to achieve nonlinear feature selection on attributed networks with the help of both labeled and unlabeled data. Methodologically, we first propose a novel semi-supervised nonlinear framework, FS-GCN, based on graph convolutional networks (GCNs) to select high-quality features; it captures the nonlinear dependency between nodal attributes and class labels. To verify the importance of nonlinearity, we further explore the possibility of removing the label information entirely, which yields an unsupervised variant of FS-GCN, referred to as UFS-GCN. Experimental results on several real-world datasets validate the superiority of FS-GCN as well as UFS-GCN in terms of the quality of selected features, suggesting their robustness when the label ratio is extremely low or even zero.
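
The limitation of linear criteria noted in the abstract can be illustrated with a toy example. The sketch below is not taken from the paper and does not implement FS-GCN; the synthetic feature `x`, the label `y = x**2`, and the use of Pearson correlation as the "linear" score are all assumptions chosen for illustration. It shows a feature that fully determines the label yet receives a near-zero linear score.

```python
import numpy as np

# Hypothetical toy data: one feature drawn symmetrically around zero.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=10_000)

# The label depends on x only through a nonlinear map (assumed here: x squared).
y = x ** 2

# A linear criterion (Pearson correlation) sees almost no relation,
# because positive and negative x values cancel out.
linear_corr = np.corrcoef(x, y)[0, 1]

# After a nonlinear transform of the feature, the dependency is obvious.
nonlinear_corr = np.corrcoef(x ** 2, y)[0, 1]

print(f"linear score:    {linear_corr:.3f}")
print(f"nonlinear score: {nonlinear_corr:.3f}")
```

A selector that ranks features by the linear score would likely discard `x` here, even though `x` determines `y` exactly; this is the kind of dependency a nonlinear selector is meant to capture.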
