Feature weighting using a clustering approach

In recent decades, the volume and dimensionality of data have increased significantly with the growth of technology. Extracting knowledge and useful patterns from high-dimensional data is challenging: irrelevant features and dimensions reduce the efficiency and increase the complexity of machine learning algorithms. Feature selection and feature weighting methods are a common solution to these problems. In this study, a feature weighting approach based on density-based clustering is presented. The method is implemented in two steps. In the first step, the features are divided into clusters using density-based clustering. In the second step, the features with a higher degree of importance with respect to the target class are selected from each cluster. To evaluate the efficiency of the method, several standard datasets were classified using the selected features and their degrees of importance. The results indicate that the main advantages of the proposed method are its simplicity and its suitability for high-dimensional datasets.
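
The two steps described above can be illustrated with a minimal sketch. It is not the implementation evaluated in this study: the abstract does not specify the distance measure between features, the clustering parameters, or the importance score, so everything concrete here is an assumption. The sketch assumes that two features are close when their absolute Pearson correlation is high, that a plain DBSCAN over this distance stands in for the density-based clustering, and that the importance of a feature is its absolute correlation with the class label. The names `pearson`, `dbscan`, `cluster_feature_weights`, `eps`, and `min_pts` are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / math.sqrt(va * vb)


def dbscan(dist, eps, min_pts):
    """Plain DBSCAN on a precomputed distance matrix; label -1 marks noise."""
    n = len(dist)
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        neighbours = [j for j in range(n) if dist[i][j] <= eps]
        if len(neighbours) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins the cluster, is not expanded
                continue
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nj = [k for k in range(n) if dist[j][k] <= eps]
            if len(nj) >= min_pts:  # core point: keep expanding the cluster
                queue.extend(nj)
    return labels


def cluster_feature_weights(X, y, eps=0.3, min_pts=2):
    """Step 1: cluster the features. Step 2: keep the most class-relevant one per cluster."""
    cols = list(zip(*X))  # one tuple per feature
    d = len(cols)
    # Assumed feature distance: 1 - |correlation|, so redundant features are close.
    dist = [[1.0 - abs(pearson(cols[i], cols[j])) for j in range(d)] for i in range(d)]
    labels = dbscan(dist, eps, min_pts)
    # Assumed importance score: |correlation with the class label|.
    relevance = [abs(pearson(c, y)) for c in cols]
    groups = {}
    for i, lab in enumerate(labels):
        key = lab if lab != -1 else ("noise", i)  # each noise feature is its own group
        groups.setdefault(key, []).append(i)
    weights = [0.0] * d
    for members in groups.values():
        best = max(members, key=lambda i: relevance[i])
        weights[best] = relevance[best]
    return labels, weights


# Toy data: feature 1 is nearly a rescaled copy of feature 0; feature 2 is unrelated.
X = [[1, 2, 5], [2, 4, 1], [3, 6, 4], [4, 8, 2], [5, 10, 6], [6, 13, 3]]
y = [0, 0, 0, 1, 1, 1]
labels, weights = cluster_feature_weights(X, y)
print(labels, [round(w, 3) for w in weights])
```

In this toy example the two redundant features fall into one cluster and only the more class-relevant of them receives a non-zero weight, while the unrelated feature is treated as a singleton and keeps its own (low) score. Treating noise features as singletons is a design choice of the sketch; discarding them would be an equally plausible reading of the description.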
