The ordering of expression among a few genes can provide simple cancer biomarkers and signal BRCA1 mutations

Background: A major challenge in computational biology is to extract knowledge about the genetic nature of disease from high-throughput data. However, an important obstacle to both biological understanding and clinical applications is the "black box" nature of the decision rules provided by most machine learning approaches, which usually involve many genes combined in a highly complex fashion. Achieving biologically relevant results argues for a different strategy. A promising alternative is to base prediction entirely upon the relative expression ordering of a small number of genes.

Results: We present a three-gene version of "relative expression analysis" (RXA), a rigorous and systematic comparison with earlier approaches in a variety of cancer studies, a clinically relevant application to predicting germline BRCA1 mutations in breast cancer, and a cross-study validation for predicting ER status. In the BRCA1 study, RXA yields high accuracy with a simple decision rule: in tumors carrying mutations, the expression of a "reference gene" falls between the expression of two differentially expressed genes, PPP1CB and RNF14. An analysis of the protein-protein interactions among the triplet of genes and BRCA1 suggests that the classifier has a biological foundation.

Conclusion: RXA has the potential to identify genomic "marker interactions" with plausible biological interpretation and direct clinical applicability. It provides a general framework for understanding the roles of the genes involved in decision rules, as illustrated for the difficult and clinically relevant problem of identifying BRCA1 mutation carriers.
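The three-gene rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the reference gene, so `REF_GENE` is a hypothetical placeholder, the function name `predicts_mutation` is an assumption, and the expression values are invented for illustration. Because the abstract does not say which of PPP1CB and RNF14 is the more highly expressed, the sketch only tests whether the reference gene lies strictly between the two.

```python
from typing import Mapping


def predicts_mutation(
    expr: Mapping[str, float],
    flank_a: str = "PPP1CB",
    flank_b: str = "RNF14",
    ref_gene: str = "REF_GENE",  # hypothetical name; the abstract does not identify it
) -> bool:
    """Return True if the reference gene's expression lies strictly between
    the expression of the two flanking genes, in either direction."""
    low = min(expr[flank_a], expr[flank_b])
    high = max(expr[flank_a], expr[flank_b])
    return low < expr[ref_gene] < high


# Invented expression profiles, for illustration only.
sample_between = {"PPP1CB": 2.0, "REF_GENE": 5.0, "RNF14": 9.0}
sample_outside = {"PPP1CB": 2.0, "REF_GENE": 10.0, "RNF14": 9.0}

print(predicts_mutation(sample_between))  # reference gene lies between the two
print(predicts_mutation(sample_outside))  # reference gene lies above both
```

A rule of this form depends only on the ordering of the three values within a sample, so any strictly increasing transformation of that sample's measurements (a log transform, for example) leaves the prediction unchanged.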
