The 3rd DBCLS BioHackathon: improving life science data integration with Semantic Web technologies

Background: BioHackathon 2010 was the third in a series of meetings hosted by the Database Center for Life Sciences (DBCLS) in Tokyo, Japan. The overall goal of the BioHackathon series is to improve the quality and accessibility of life science research data on the Web by bringing together representatives from public databases, analytical tool providers, and cyber-infrastructure researchers to jointly tackle important challenges in the area of in silico biological research.

Results: The theme of BioHackathon 2010 was the 'Semantic Web', and all attendees gathered with the shared goal of producing Semantic Web data from their respective resources, and/or consuming or interacting with those data using their tools and interfaces. We discussed topics including guidelines for designing semantic data and the interoperability of resources. As a result, we developed tools and clients for analysis and visualization.

Conclusion: We provide a meeting report from BioHackathon 2010, in which we describe the discussions, decisions, and breakthroughs made as we moved towards compliance with Semantic Web technologies, from source provider, through middleware, to the end-consumer.

Fumikazu Konishi | Hammad Afzal | Gos Micklem | Yu Lin | Kiyoshi Asai | Andrea Splendiani | Hideaki Sugawara | Anna-Lena Lamprecht | Chisato Yamasaki | Yasunori Yamamoto | Atsuko Yamaguchi | Toshihisa Takagi | Matthias Samwald | Rutger A Vos | Mark D. Wilkinson | Pjotr Prins | Keiichiro Ono | Eli Kaminuma | Kazuhiro Hayashi | Shuichi Kawashima | Kunihiro Nishimura | Hong-Woo Chun | Pierre Lindenbaum | Kozo Nishida | Toshiaki Katayama | Hideya Kawaji | Kazuharu Arakawa | Alberto Labarga | Jan Aerts | Taro L. Saito | Erick Antezana | Jerven T. Bolleman | Young Joo Kim | Heiko Horn | Soichi Ogishima | Brad Chapman | Shinobu Okamoto | Bruno Aranda | James Taylor | Richard Smith | Mitsuteru Nakao | Tatsuya Nishizawa | Tore Eriksson | Yasumasa Shigemoto | Naohisa Goto | Shoko Kawamoto | Akira R Kinjo | Kenta Oouchida | Kyung-Hoon Kwon | Paul MK Gordon | Luke McCarthy | Arek Kasprzyk | Keun-Joon Park | Nobuhiro Kido | Venkata P Satagopam | Ryosuke Ishiwata | Katsuhiko Murakami | Koji Nagao | Kazuki Oshita | Francois Belleau | Raoul JP Bonnal | Peter JA Cock | Hideyuki Morita | David Withers | Christian M Zmasek | Kosaku Okubo
