A heuristic biomarker selection approach based on professional tennis player ranking strategy

Extracting significant features from high-dimensional biological data with small sample sizes is a challenging problem. Recently, Michał Draminski proposed the Monte Carlo feature selection (MC) algorithm, which was able to search over large feature spaces and achieved better classification accuracy. However, MC does not use the information carried by variations in feature rank, and it does not update feature ranks dynamically. Here, we propose a novel feature selection algorithm that integrates ideas from professional tennis player ranking, such as seed players and dynamic ranking, into Monte Carlo simulation. Seed players make the feature selection game more competitive and selective. The dynamic ranking strategy ensures that the current best players always take part in each competition. The proposed algorithm is tested on 8 biological datasets. The results show that the proposed method is computationally efficient and stable, and that it performs favorably in classification.
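
The abstract does not specify how rounds are scored or how ranks are updated, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' algorithm. Features act as players; each round re-ranks them (dynamic ranking), admits the current top-ranked features as seeds, and draws the remaining players at random (the Monte Carlo step). The names `class_gap` and `tennis_mc_rank`, all parameters, the toy relevance score, and the points-per-match ranking rule are hypothetical choices made for illustration.

```python
import random
from statistics import mean


def class_gap(X, y, rows, j):
    """Toy relevance score (an assumption): gap between the class means of feature j."""
    a = [X[i][j] for i in rows if y[i] == 0]
    b = [X[i][j] for i in rows if y[i] == 1]
    return abs(mean(a) - mean(b))


def tennis_mc_rank(X, y, n_rounds=200, n_seeds=2, n_challengers=2, rng_seed=0):
    """Rank features by average placement points over random 'tournaments'."""
    rng = random.Random(rng_seed)
    n_features = len(X[0])
    points = [0.0] * n_features  # placement points won so far
    matches = [0] * n_features   # number of rounds each feature has played

    def form(j):
        # Points per match; features that have not played yet count as 0.
        return points[j] / matches[j] if matches[j] else 0.0

    class0 = [i for i, label in enumerate(y) if label == 0]
    class1 = [i for i, label in enumerate(y) if label == 1]

    for _ in range(n_rounds):
        # Dynamic ranking: re-rank before every round so the current best are seeded.
        ranking = sorted(range(n_features), key=form, reverse=True)
        seeds, others = ranking[:n_seeds], ranking[n_seeds:]
        # Monte Carlo step: challengers and the sample subset are drawn at random.
        challengers = rng.sample(others, min(n_challengers, len(others)))
        players = seeds + challengers
        rows = (rng.sample(class0, max(1, len(class0) // 2))
                + rng.sample(class1, max(1, len(class1) // 2)))
        # Placement points: worst player in the round gets 0, best gets len(players) - 1.
        placed = sorted(players, key=lambda j: class_gap(X, y, rows, j))
        for place, j in enumerate(placed):
            points[j] += place
            matches[j] += 1

    return sorted(range(n_features), key=form, reverse=True)


# Synthetic data: 20 samples, 6 features, only feature 3 separates the classes.
data_rng = random.Random(1)
y = [0] * 10 + [1] * 10
X = [[data_rng.random() for _ in range(6)] for _ in y]
for row, label in zip(X, y):
    row[3] = 5.0 * label + data_rng.random()

ranking = tennis_mc_rank(X, y)
print(ranking[0])  # 3
```

In this toy setting the informative feature wins every round it enters, so it becomes a permanent seed and finishes first. A real application would presumably replace `class_gap` with a classifier-based score evaluated on held-out samples.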
