BioRegistry: Automatic extraction of metadata for biological database retrieval and discovery

The numerous biological databases produced by genomic and post-genomic research create a need for a well-maintained, searchable directory. The BioRegistry repository associates content metadata drawn from a biomedical thesaurus with biological databases, to support their retrieval and discovery. It is generated automatically from a publicly available list of biological databases. Query modalities include search by semantic similarity. System performance is evaluated in terms of precision and recall on a test collection. A classification method is also proposed for browsing and discovering databases through the BioRegistry.
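
As an illustration of the evaluation measures named above, the following minimal sketch computes precision and recall for a single query, assuming the retrieved and relevant databases are given as sets of identifiers. The function name and the identifiers are hypothetical and are not taken from the BioRegistry system.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: identifiers returned by the search (hypothetical input)
    relevant:  identifiers judged relevant in the test collection
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # relevant items that were actually returned
    # Guard against empty sets to avoid division by zero.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query result: four databases returned, three judged relevant.
p, r = precision_recall(["db1", "db2", "db3", "db4"], ["db1", "db2", "db5"])
print(f"precision={p:.2f} recall={r:.2f}")
```

Precision is the fraction of returned databases that are relevant, and recall is the fraction of relevant databases that are returned; averaging both over all queries of a test collection gives system-level figures.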
