Combining Gene Expression Profiles and Drug Activity Patterns Analysis: A Relational Clustering Approach

The combined analysis of tissue microarray and drug response datasets has the potential to reveal valuable knowledge about the relations between gene expression and drug activity patterns in tumor cells. However, the volume and complexity of biological data call for appropriate data mining models to extract interesting patterns and useful information. The ultimate goal of this paper is to define a model that, given the gene expression profile of a specific tumor tissue, could help in selecting the drugs to which that tissue is most responsive. This is accomplished through an integrated framework based on a constraint-based clustering algorithm, called Relational K-Means, which groups cell lines using drug response information while taking into account cell-to-cell relationships derived from their gene expression profiles.
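The idea of clustering on drug response while letting expression-derived relationships influence the grouping can be illustrated with a minimal sketch. This is not the paper's Relational K-Means: the abstract does not specify the objective, so the cost used here (squared distance to a drug-response centroid, minus a weighted mean expression similarity to the cluster's current members) is an assumption, as are the function name `relational_kmeans_sketch`, the `weight` and `init` parameters, and the toy data.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(rows):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def relational_kmeans_sketch(drug, expr_sim, k, weight=1.0, iters=20,
                             init=None, seed=0):
    """Hypothetical illustration, not the authors' algorithm.

    drug     : list of drug-response vectors, one per cell line
    expr_sim : n x n matrix of gene-expression similarities between cell lines
    weight   : assumed trade-off between drug distance and expression similarity
    init     : optional list of k cell-line indices used as starting centroids
    """
    n = len(drug)
    if init is None:
        init = random.Random(seed).sample(range(n), k)
    centroids = [list(drug[i]) for i in init]
    # Start from a plain nearest-centroid assignment on drug response only.
    labels = [min(range(k), key=lambda c: sq_dist(drug[i], centroids[c]))
              for i in range(n)]
    for _ in range(iters):
        changed = False
        for i in range(n):
            def cost(c):
                members = [j for j in range(n) if j != i and labels[j] == c]
                rel = (sum(expr_sim[i][j] for j in members) / len(members)
                       if members else 0.0)
                # Low drug distance and high expression similarity both help.
                return sq_dist(drug[i], centroids[c]) - weight * rel
            best = min(range(k), key=cost)
            if best != labels[i]:
                labels[i] = best
                changed = True
        # Recompute centroids from drug responses; empty clusters keep theirs.
        for c in range(k):
            rows = [drug[i] for i in range(n) if labels[i] == c]
            if rows:
                centroids[c] = mean_vector(rows)
        if not changed:
            break
    return labels, centroids


# Toy example: cell line 2 has an ambiguous drug response but an
# expression profile similar to cell line 1.
toy_drug = [[0.0], [2.0], [1.0]]
toy_sim = [[0.0, 0.0, 0.0],
           [0.0, 0.0, 1.0],
           [0.0, 1.0, 0.0]]
toy_labels, toy_centroids = relational_kmeans_sketch(
    toy_drug, toy_sim, k=2, weight=1.0, init=[0, 1])
print(toy_labels)
```

In this toy case the third cell line's drug response is equidistant from both starting centroids, so the expression similarity decides which cluster it joins. Setting `weight=0.0` reduces the sketch to ordinary K-Means on drug response alone.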
