BioHackathon 2015: Semantics of data for life sciences and reproducible research

We report on the activities of the 2015 edition of the BioHackathon, an annual event that brings together researchers and developers from around the world to develop tools and technologies that promote the reusability of biological data. We discuss issues surrounding the representation, publication, integration, mining and reuse of biological data and metadata across a wide range of biomedical data types relevant to the life sciences, including chemistry, genotypes and phenotypes, orthology and phylogeny, proteomics, genomics, glycomics, and metabolomics. We describe our progress in addressing ongoing challenges to the reusability and reproducibility of research results, and identify outstanding issues that continue to impede the progress of bioinformatics research. We share our perspective on the state of the art, continued challenges, and goals for future research and development for the life sciences Semantic Web.

Akira R. Kinjo | Núria Queralt-Rosinach | Michel Dumontier | Hongyan Wu | Tudor Groza | Jesualdo Tomás Fernández-Breis | Kevin Bretonnel Cohen | Takeshi Kawashima | Jin-Dong Kim | Benedict Paten | Yasunori Yamamoto | Atsuko Yamaguchi | Jee-Hyub Kim | Toshihisa Takagi | Eric W. Deutsch | Yuki Moriya | Mark D. Wilkinson | Shujiro Okuda | Akiyasu C. Yoshizawa | Kouji Kozaki | Pjotr Prins | Takatomo Fujisawa | Shuichi Kawashima | Takeru Nakazato | Hidemasa Bono | Naoki Nishida | Hirokazu Chiba | Ikuo Uchiyama | Tazro Ohta | Hiroshi Mori | Kozo Nishida | Rutger A. Vos | Erick Antezana | Toshiaki Katayama | Terue Takatsuki | Chih-Hsuan Wei | Kazuharu Arakawa | Masaaki Matsubara | Daisuke Shinmachi | Issaku Yamada | Tatsuya Kushida | Shin Kawano | Ryota Yamanaka | Nick Juty | Nobuyuki P. Aoki | Alexander Garcia | Jerven T. Bolleman | Bruno Vieira | Mark Thompson | Robert Hoehndorf | Yuki Naito | Evan E. Bolton | Gang Fu | Attayeb Mohsen | Kenjiro Kosaki | Shinya Suzuki | Raoul J.P. Bonnal | Kotone Itaya | Hiroyo Nishide | Soichi Ogishima | Toshiaki Tokimatsu | Kees Burger | Naohisa Goto | Kieron Taylor | Thomas Lütteke | Jean-Luc Perret | Hiroyuki Mishima | Joe Miyamoto | Peter Amstutz | Kazutoshi Yoshitake | Masaaki Kotera | Atsushi Fukushima | Colin Hercus | Sadahiro Kumagai | Jeremy Nguyen-Xuan | Philip Prathipati | Tsuyosi Tabata
