Biomedical Literature Mining for Biological Databases Annotation

Margherita Berardi1,5, Donato Malerba1, Roberta Piredda2, Marcella Attimonelli2, Gaetano Scioscia3,4 and Pietro Leo3,4
1 Dipartimento di Informatica – Università degli Studi di Bari
2 Dipartimento di Biochimica e Biologia Molecolare "E. Quagliariello" – Università degli Studi di Bari
3 IBM Italia S.p.A. Molecular Biodiversity Laboratory
4 IBM Italia S.p.A. GBS Innovation Centre
5 Exhicon S.r.l., Bari, Italy

[1]  Shan-Hwei Nienhuys-Cheng,et al.  Foundations of Inductive Logic Programming , 1997, Lecture Notes in Computer Science.

[2]  G. Pesole,et al.  A novel method for estimating substitution rate variation among sites in a large dataset of homologous DNA sequences. , 2001, Genetics.

[3]  Alexander A. Morgan,et al.  Evaluation of text data mining for database curation: lessons learned from the KDD Challenge Cup , 2003, ISMB.

[4]  Giorgio Levi,et al.  Generalized AND/OR Graphs , 1976, Artif. Intell..

[5]  P. Bork,et al.  Literature mining for the biologist: from information retrieval to biological discovery , 2006, Nature Reviews Genetics.

[6]  S. Dimauro,et al.  The genetics and pathology of oxidative phosphorylation , 2001, Nature Reviews Genetics.

[7]  Alfonso Valencia,et al.  Text-mining approaches in molecular biology and biomedicine. , 2005, Drug discovery today.

[8]  Peter Willett,et al.  Readings in information retrieval , 1997 .

[9]  D. Wallace,et al.  Mitochondrial DNA variation in human evolution and disease. , 1999, Gene.

[10]  Marcella Attimonelli,et al.  HmtDB, a Human Mitochondrial Genomic Resource Based on Variability Studies Supporting Population Genetics and Biomedical Research , 2005, BMC Bioinformatics.

[11]  G. Valle,et al.  Do the four clades of the mtDNA haplogroup L2 evolve at different rates? , 2001, American journal of human genetics.

[12]  M. Attimonelli,et al.  Human mtDNA site‐specific variability values can act as haplogroup markers , 2006, Human mutation.

[13]  Peer Bork,et al.  LSAT: learning about alternative transcripts in MEDLINE , 2006, Bioinform..

[14]  J. W. Lloyd,et al.  Foundations of logic programming; (2nd extended ed.) , 1987 .

[15]  Claire Nédellec,et al.  Machine Learning for Information Extraction in Genomics — State of the Art and Perspectives , 2004 .

[16]  James S. Aitken Learning Information Extraction Rules: An Inductive Logic Programming approach , 2002, ECAI.

[17]  Zhiyong Lu,et al.  Generif Quality Assurance as Summary Revision , 2006, Pacific Symposium on Biocomputing.

[18]  D. Turnbull,et al.  Reanalysis and revision of the Cambridge reference sequence for human mitochondrial DNA , 1999, Nature Genetics.

[19]  Padmini Srinivasan,et al.  Text mining: Generating hypotheses from MEDLINE , 2004, J. Assoc. Inf. Sci. Technol..

[20]  Ashwin Srinivasan,et al.  Using ILP to Construct Features for Information Extraction from Semi-structured Text , 2007, ILP.

[21]  Thomas G. Dietterich What is machine learning? , 2020, Archives of Disease in Childhood.

[22]  Hagit Shatkay,et al.  Mining the Biomedical Literature in the Genomic Era: An Overview , 2003, J. Comput. Biol..

[23]  Mark Craven,et al.  Constructing Biological Knowledge Bases by Extracting Information from Text Sources , 1999, ISMB.

[24]  Michèle Sebag,et al.  Scalability and efficiency in multi-relational data mining , 2003, SKDD.

[25]  Alfonso Valencia,et al.  Overview of BioCreAtIvE: critical assessment of information extraction for biology , 2005, BMC Bioinformatics.

[26]  Yorick Wilks,et al.  Information Extraction: Beyond Document Retrieval , 1998, Int. J. Comput. Linguistics Chin. Lang. Process..

[27]  Donato Malerba,et al.  Learning Recursive Patterns for Biomedical Information Extraction , 2007, ILP.

[28]  Fred E. Cohen,et al.  Automated extraction of mutation data from the literature: application of MuteXt to G protein-coupled receptors and nuclear hormone receptors , 2004, Bioinform..

[29]  J Allan,et al.  Readings in information retrieval. , 1998 .

[30]  Jude W. Shavlik,et al.  Learning Ensembles of First-Order Clauses for Recall-Precision Curves: A Case Study in Biomedical Information Extraction , 2004, ILP.

[31]  Graziano Pesole,et al.  The estimation of relative site variability among aligned homologous protein sequences , 2003, Bioinform..

[32]  Donato Malerba,et al.  On the Effect of Caching in Recursive Theory Learning , 2004, ILP.

[33]  J. Lloyd Foundations of Logic Programming , 1984, Symbolic Computation.

[34]  Donato Malerba,et al.  Learning Recursive Theories in the Normal ILP Setting , 2003, Fundam. Informaticae.

[35]  D. Rebholz-Schuhmann,et al.  Facts from Text—Is Text Mining Ready to Deliver? , 2005, PLoS biology.

[36]  William R. Hersh,et al.  A survey of current work in biomedical text mining , 2005, Briefings Bioinform..