Unsupervised Feature Ranking via Attribute Networks

The need for learning from unlabeled data is increasing in contemporary machine learning. Methods for unsupervised feature ranking, which identify the most important features in such data, are thus gaining attention, as are their applications in the study of high-throughput biological experiments and of user bases for recommender systems. We propose FRANe (Feature Ranking via Attribute Networks), an unsupervised algorithm capable of finding key features in a given unlabeled data set. FRANe is based on ideas from network reconstruction and network analysis. In an empirical evaluation on a large collection of benchmarks, FRANe performs better than state-of-the-art competitors. Moreover, we provide a time complexity analysis of FRANe, which further demonstrates its scalability. Finally, FRANe also returns the interpretable relational structures used to derive the feature importances.
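
The general idea of ranking features through an attribute network can be illustrated with a minimal sketch. It is not the FRANe implementation: the abstract does not specify how the network is reconstructed or analysed, so the sketch assumes absolute Pearson correlation as the edge weight, a fixed threshold for keeping edges, and PageRank as the node-importance measure. All function names and parameter values are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equally long sequences (0.0 if one is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def attribute_network(columns, threshold):
    """Network reconstruction step (assumed): nodes are features, and an
    undirected edge i-j with weight |corr| is kept if |corr| >= threshold."""
    d = len(columns)
    weights = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1, d):
            w = abs(pearson(columns[i], columns[j]))
            if w >= threshold:
                weights[i][j] = weights[j][i] = w
    return weights


def pagerank(weights, damping=0.85, max_iter=200, tol=1e-12):
    """Network analysis step (assumed): weighted PageRank by power iteration.
    Isolated nodes spread their rank uniformly over all nodes."""
    d = len(weights)
    out_strength = [sum(row) for row in weights]
    rank = [1.0 / d] * d
    for _ in range(max_iter):
        dangling = sum(rank[i] for i in range(d) if out_strength[i] == 0)
        new_rank = [(1 - damping) / d + damping * dangling / d] * d
        for i in range(d):
            if out_strength[i] > 0:
                for j in range(d):
                    if weights[i][j] > 0:
                        new_rank[j] += damping * rank[i] * weights[i][j] / out_strength[i]
        converged = sum(abs(a - b) for a, b in zip(new_rank, rank)) < tol
        rank = new_rank
        if converged:
            break
    return rank


def rank_features(columns, threshold=0.5):
    """Return feature indices sorted by decreasing score, plus the scores."""
    scores = pagerank(attribute_network(columns, threshold))
    order = sorted(range(len(columns)), key=lambda i: -scores[i])
    return order, scores


# Toy data: features 0-2 share a latent signal, features 3-4 are pure noise.
rng = random.Random(0)
latent = [rng.gauss(0, 1) for _ in range(200)]
columns = [[z + rng.gauss(0, 0.3) for z in latent] for _ in range(3)]
columns += [[rng.gauss(0, 1) for _ in range(200)] for _ in range(2)]

order, scores = rank_features(columns)
print("feature order:", order)
```

In this toy setting the three features that share a latent signal form a connected component and receive higher scores than the isolated noise features. The thresholded weight matrix is also the kind of relational structure that can be inspected alongside the ranking.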
