Professional tennis player ranking strategy based Monte Carlo feature selection

Extracting significant features from high-dimensional, small-sample-size microarray data is a challenging problem. As an alternative to conventional wrapper or filter methods, we propose a novel feature selection algorithm that combines ideas from professional tennis player ranking, such as seed players and dynamic ranking, with Monte Carlo simulation. Seed players make the ‘game’ more competitive and selective, and hence improve selection efficiency. In addition, the ranks of the features are dynamically updated, which ensures that the current best players always take part in each competition. The proposed algorithm is tested on widely used public datasets. The results demonstrate that, in comparison, the proposed method converges faster, is more stable, and achieves good classification performance, and is therefore an efficient algorithm for feature selection.
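The abstract does not spell out the algorithm's details, so the following Python fragment is only a minimal sketch of the general idea, not the authors' implementation. In each Monte Carlo round, the currently top-ranked features (the seeds) compete alongside randomly drawn challengers, and the ranking is recomputed before the next round. The names `rank_features`, `evaluate`, and `toy_evaluate`, all parameter defaults, and the share-based scoring rule are assumptions introduced for illustration.

```python
import random


def rank_features(n_features, evaluate, n_rounds=300, subset_size=5,
                  n_seeds=2, seed=0):
    """Return feature indices sorted from best to worst mean score.

    `evaluate(subset)` is a hypothetical callback that maps a list of
    feature indices to a dict {feature: score} for that round.
    """
    rng = random.Random(seed)
    total = [0.0] * n_features   # accumulated score per feature
    played = [0] * n_features    # rounds each feature has taken part in

    def mean_score(f):
        return total[f] / played[f] if played[f] else 0.0

    for _ in range(n_rounds):
        # Dynamic ranking: re-rank on the scores accumulated so far.
        ranking = sorted(range(n_features), key=mean_score, reverse=True)
        # Seeds (current best features) always enter the round.
        seeds = ranking[:n_seeds]
        # Monte Carlo step: the remaining slots go to random challengers.
        challengers = rng.sample(ranking[n_seeds:], subset_size - n_seeds)
        subset = seeds + challengers
        for f, score in evaluate(subset).items():
            total[f] += score
            played[f] += 1

    return sorted(range(n_features), key=mean_score, reverse=True)


def toy_evaluate(subset, strong=frozenset({3, 7})):
    """Toy scoring rule (assumption): each feature's share of the subset's
    total weight, so strong seeds lower the score of weak challengers."""
    weights = {f: 5.0 if f in strong else 1.0 for f in subset}
    norm = sum(weights.values())
    return {f: w / norm for f, w in weights.items()}


ranking = rank_features(20, toy_evaluate)
top_two = set(ranking[:2])
```

In a real setting, `evaluate` would presumably train a classifier on the selected genes and return a per-feature importance; the toy version here only stands in for that step so the sketch runs on its own.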
