Microarray-Based Cancer Prediction Using Soft Computing Approach

One of the difficulties in using gene expression profiles to predict cancer is how to select, from thousands or tens of thousands of genes, a few informative ones from which accurate prediction models can be built. Using a soft computing approach based on rough set theory, we screen for highly discriminative genes and gene pairs and build simple prediction models, each involving a single gene or a gene pair. These simple models predict cancer accurately on four cancer gene expression datasets: CNS tumor, colon tumor, lung cancer and DLBCL. We also identify some genes closely correlated with the pathogenesis of specific cancers or of cancer in general. Compared with other models, ours are simple, effective and robust, and they are interpretable because they are based on decision rules. Our results demonstrate that very simple models can perform well in molecular cancer prediction, and that important cancer gene markers can be detected, provided the gene selection approach is chosen reasonably.
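The abstract does not spell out how genes are scored, so the following is only an illustrative sketch. It assumes the standard rough-set dependency degree (the fraction of samples whose discretized gene value determines their class unambiguously) as the screening measure, and a fixed cut-off for discretization. The gene names, expression values, threshold and function names are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict

def discretize(expression, threshold):
    """Map continuous expression values to 'high'/'low' (assumed fixed cut-off)."""
    return ["high" if x > threshold else "low" for x in expression]

def dependency_degree(values, labels):
    """Rough-set dependency of the class on one discretized gene:
    share of samples lying in equivalence classes with a single class label."""
    groups = defaultdict(list)
    for v, y in zip(values, labels):
        groups[v].append(y)
    consistent = sum(len(ys) for ys in groups.values() if len(set(ys)) == 1)
    return consistent / len(labels)

def decision_rules(values, labels):
    """Emit 'value -> class' rules for every value that maps to one class only."""
    groups = defaultdict(set)
    for v, y in zip(values, labels):
        groups[v].add(y)
    return {v: next(iter(ys)) for v, ys in groups.items() if len(ys) == 1}

# Toy data, invented for illustration only.
labels = ["tumor", "tumor", "tumor", "normal", "normal", "normal"]
genes = {
    "gene_A": [9.1, 8.7, 9.5, 2.0, 1.8, 2.4],  # separates the classes cleanly
    "gene_B": [5.0, 1.0, 5.2, 4.9, 1.1, 5.1],  # uninformative
}
THRESHOLD = 3.0  # hypothetical cut-off

scores = {
    name: dependency_degree(discretize(expr, THRESHOLD), labels)
    for name, expr in genes.items()
}
best = max(scores, key=scores.get)
rules = decision_rules(discretize(genes[best], THRESHOLD), labels)
print(scores, best, rules)
```

A gene with dependency degree 1.0 yields a complete single-gene rule set of the form "if the gene is high then tumor, otherwise normal", which is the kind of interpretable model the abstract refers to. The same scoring could be applied to gene pairs by grouping samples on the pair of discretized values.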
