Target discovery from data mining approaches.

Data mining of available biomedical data and information has greatly boosted target discovery in the 'omics' era. Target discovery is a key step in the biomarker and drug discovery pipeline for diagnosing and fighting human diseases. In biomedical science, a 'target' is a broad concept ranging from molecular entities (such as genes, proteins and miRNAs) to biological phenomena (such as molecular functions, pathways and phenotypes). In this context, data mining refers to a bioinformatics approach that combines biological concepts with computational tools or statistical methods, used mainly to discover, select and prioritize targets. In response to the strong demand for data mining in target discovery, this review explains various data mining approaches and their applications to target discovery, with emphasis on text and microarray data analysis. Two emerging approaches, chemogenomic data mining and proteomic data mining, are briefly introduced. Also discussed are the limitations of these approaches, which arise from the level of database integration, the quality of data annotation, sample heterogeneity and the performance of analytical and mining tools. Tentative strategies for integrating different data sources for target discovery are introduced, such as combining text mining with high-throughput data analysis and combining mining with pathway databases.
