Biodiversity informatics: organizing and linking information across the spectrum of life

Biological knowledge can be inferred from three major levels of information: molecules, organisms and ecologies. Bioinformatics is an established field that has made significant advances in developing systems and techniques to organize contemporary molecular data. Biodiversity informatics is an emerging discipline that strives to develop methods to organize knowledge at the organismal level, extending back to the earliest dates of recorded natural history. Furthermore, while bioinformatics studies generally focus on detailed examinations of key 'model' organisms, biodiversity informatics aims to develop overarching hypotheses that span the entire tree of life. Biodiversity informatics is presented here as a discipline that unifies biological information from a range of contemporary and historical sources across the spectrum of life, using organisms as the linking thread. The present review focuses primarily on the use of organism names as a universal metadata element to link and integrate biodiversity data across a range of data sources.
