BioBrowsing: Making the Most of the Data Available in Entrez

One of the most popular ways to access public biological data is through portals such as Entrez (NCBI), which allows users to navigate the data of 34 major biological sources by following cross-references. In this process, data entries are inspected one after the other, and cross-references to additional data available in other sources may be followed. This navigational process can be time-consuming and is not easily reproduced from one entry to another. Most importantly, only a few sources are initially queried, and biologists do not exploit the full richness of the data provided by Entrez; in particular, they may not explore alternative source paths that provide complementary information. In this paper, we introduce BioBrowsing, a tool that gives scientists access to the data obtained when all the combinations of paths between NCBI sources have been followed. Querying is done on the fly (no warehousing). As new sources and links between sources appear in Entrez, a dedicated BioBrowsing module automatically updates the schema used by its query engine. Finally, BioBrowsing lets users define profiles as a way of focusing the results on their specific interests. Availability: http://bioguide-project.net/biobrowsing
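
To make the idea of "alternative source paths" concrete, the following minimal sketch enumerates every cross-reference path between two sources in a small link graph. It is an illustration only, not BioBrowsing's actual query engine: the `LINKS` graph, the function name `all_source_paths`, and the `max_len` bound are all assumptions introduced here, and the links shown are a hypothetical subset rather than the real Entrez schema.

```python
from typing import Dict, List

# Hypothetical cross-reference graph (source -> sources it links to).
# The source names exist in Entrez, but this link set is illustrative only.
LINKS: Dict[str, List[str]] = {
    "Gene": ["Protein", "OMIM", "PubMed"],
    "Protein": ["PubMed", "Structure"],
    "OMIM": ["PubMed"],
    "Structure": ["PubMed"],
    "PubMed": [],
}


def all_source_paths(
    links: Dict[str, List[str]], start: str, target: str, max_len: int = 4
) -> List[List[str]]:
    """Enumerate every simple path from start to target.

    A path never visits the same source twice and contains at most
    max_len sources. Results are sorted by length, then alphabetically.
    """
    paths: List[List[str]] = []
    stack: List[List[str]] = [[start]]
    while stack:
        path = stack.pop()
        last = path[-1]
        if last == target:
            paths.append(path)
            continue
        if len(path) >= max_len:
            continue  # do not extend paths beyond the length bound
        for nxt in links.get(last, []):
            if nxt not in path:  # keep the path simple (no cycles)
                stack.append(path + [nxt])
    return sorted(paths, key=lambda p: (len(p), p))


if __name__ == "__main__":
    for p in all_source_paths(LINKS, "Gene", "PubMed"):
        print(" -> ".join(p))
```

In this toy graph, a biologist who only follows the direct Gene to PubMed link would miss the entries reachable through Protein, OMIM, or Structure. Following every path is the kind of exhaustive exploration the abstract describes.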
