A collaborative methodology for developing a semantic model for interlinking Cancer Chemoprevention linked-data sources

This paper proposes a collaborative methodology for developing semantic data models. The methodology follows a “meet-in-the-middle” approach. On the one hand, concepts emerge bottom-up from analyzing the domain and interviewing domain experts about their data needs. On the other hand, a top-down approach analyzes existing ontologies, vocabularies and data models and integrates them with the model. The identified elements are then fed into a multiphase abstraction exercise to derive the concepts of the model. The derived model is also evaluated and validated by domain experts. The methodology is applied to the creation of the Cancer Chemoprevention semantic model, which formally defines the fundamental entities used for annotating and describing interconnected cancer chemoprevention related data and knowledge resources on the Web. This model is meant to offer a single point of reference for biomedical researchers to search, retrieve and annotate linked cancer chemoprevention related data and web resources. The model covers four areas related to Cancer Chemoprevention: (i) concepts from the literature that refer to cancer chemoprevention, (ii) facts and resources relevant for cancer prevention, (iii) collections of experimental data, procedures and protocols, and (iv) concepts to facilitate the representation of results related to virtual screening of chemopreventive agents.
