A new compact set of biomarkers for distinguishing among ten breast cancer subtypes

Worldwide, one in nine women is diagnosed with breast cancer in her lifetime, and breast cancer is the second leading cause of death among women. Accurate diagnosis of the specific subtype of this disease is vital to ensure that patients have the best possible response to therapy. Using the newly proposed ten subtypes of breast cancer, we hypothesized that machine learning techniques would be well suited to selecting the most informative biomarkers. Unlike existing gene selection approaches, we use a hierarchical classification approach that selects genes and builds the classifier concurrently. Our results indicate that this modified approach to gene selection yields a small subset of 82 genes that can predict each of the ten subtypes with accuracies ranging from 92% to 99%.
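To make the idea of selecting genes while building a hierarchical classifier concrete, the following is a minimal sketch on synthetic data and is not the method used in this work. It rests on several assumptions that the abstract does not state: each node of the hierarchy separates one subtype from the remaining ones, genes are ranked by a simple difference of class means, and a nearest-centroid rule stands in for whatever classifier is actually used. All function names (`make_data`, `fit_node`, `fit_hierarchy`, `predict`) and parameters are hypothetical.

```python
import random
from statistics import mean

def make_data(n_per_class=30, n_genes=20, n_classes=4, seed=0):
    """Synthetic expression data: class k is shifted upward on gene k only."""
    rng = random.Random(seed)
    X, y = [], []
    for k in range(n_classes):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            row[k] += 4.0  # assumed marker gene for class k
            X.append(row)
            y.append(k)
    return X, y

def fit_node(X, y, target, n_keep):
    """Select genes for 'target vs. rest' and fit centroids on those genes."""
    inside = [row for row, label in zip(X, y) if label == target]
    outside = [row for row, label in zip(X, y) if label != target]
    n_genes = len(X[0])
    # Assumed filter score: absolute difference of mean expression.
    scores = [abs(mean(r[g] for r in inside) - mean(r[g] for r in outside))
              for g in range(n_genes)]
    genes = sorted(range(n_genes), key=lambda g: -scores[g])[:n_keep]
    return {
        "target": target,
        "genes": genes,
        "c_in": [mean(r[g] for r in inside) for g in genes],
        "c_out": [mean(r[g] for r in outside) for g in genes],
    }

def node_says_target(node, row):
    """Nearest-centroid decision restricted to the node's selected genes."""
    d_in = sum((row[g] - c) ** 2 for g, c in zip(node["genes"], node["c_in"]))
    d_out = sum((row[g] - c) ** 2 for g, c in zip(node["genes"], node["c_out"]))
    return d_in < d_out

def node_accuracy(node, X, y):
    hits = sum(node_says_target(node, row) == (label == node["target"])
               for row, label in zip(X, y))
    return hits / len(y)

def fit_hierarchy(X, y, n_keep=2):
    """Peel off one class per node, choosing the best-separated class each time."""
    remaining = sorted(set(y))
    nodes = []
    while len(remaining) > 1:
        candidates = [fit_node(X, y, t, n_keep) for t in remaining]
        best = max(candidates, key=lambda n: node_accuracy(n, X, y))
        nodes.append(best)
        # Drop the samples of the class just handled before fitting the next node.
        kept = [i for i, label in enumerate(y) if label != best["target"]]
        X = [X[i] for i in kept]
        y = [y[i] for i in kept]
        remaining.remove(best["target"])
    return nodes, remaining[0]

def predict(nodes, leaf, row):
    """Walk the hierarchy; the first node that claims the sample decides."""
    for node in nodes:
        if node_says_target(node, row):
            return node["target"]
    return leaf

X, y = make_data()
nodes, leaf = fit_hierarchy(X, y)
selected = sorted({g for node in nodes for g in node["genes"]})
acc = mean(predict(nodes, leaf, row) == label for row, label in zip(X, y))
print("selected genes:", selected)
print("training accuracy:", round(acc, 3))
```

In this sketch the final gene panel is simply the union of the genes chosen at each node, so its size is bounded by the number of nodes times `n_keep`. The accuracy printed here is measured on the training data only; a realistic evaluation would use held-out samples or cross-validation.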
