Integration of prostate cancer clinical data using an ontology

Investigators increasingly need to access, interpret, and analyze data from diverse biological, literature, and annotation sources in a unified way. The heterogeneity of biomedical data and the lack of metadata are the primary obstacles to integration, and they pose major challenges to effective search and retrieval of information. As a proof of concept, the Prostate Cancer Ontology (PCO) was created to support the development of the Prostate Cancer Information System (PCIS). PCIS is used to demonstrate how the ontology resolves the semantic heterogeneity that arises when integrating two prostate cancer related database systems at the Fox Chase Cancer Center. Once the two systems are integrated, the semantic query language SPARQL is used to run integrated queries across them based on PCO.
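The idea of resolving semantic heterogeneity through a shared ontology can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the two source schemas, the field names, the `pco:` property names, and the sample records are hypothetical and are not taken from PCO, PCIS, or the Fox Chase databases. The sketch maps two differently structured record sets onto one shared vocabulary as subject-predicate-object triples, then runs a single query over the combined triples, which is roughly what a SPARQL basic graph pattern with a filter would do.

```python
# Hypothetical records from two sources that describe the same concepts
# with different column names and value encodings.
SOURCE_A = [
    {"pt_id": "A1", "gleason": 7, "race_cd": "B"},
    {"pt_id": "A2", "gleason": 6, "race_cd": "W"},
]
SOURCE_B = [
    {"PatientID": "B9", "GleasonScore": "8", "Ethnicity": "White"},
    {"PatientID": "B10", "GleasonScore": "5", "Ethnicity": "Black"},
]

RACE_CODES = {"B": "Black", "W": "White"}

# Per-source mapping: local field -> (shared ontology property, value normalizer).
# The "pco:" property names are invented for this sketch.
MAPPINGS = {
    "A": {
        "id_field": "pt_id",
        "fields": {
            "gleason": ("pco:hasGleasonScore", int),
            "race_cd": ("pco:hasRace", lambda code: RACE_CODES[code]),
        },
    },
    "B": {
        "id_field": "PatientID",
        "fields": {
            "GleasonScore": ("pco:hasGleasonScore", int),
            "Ethnicity": ("pco:hasRace", str),
        },
    },
}


def to_triples(source, rows):
    """Translate source-specific rows into (subject, predicate, object) triples."""
    mapping = MAPPINGS[source]
    triples = []
    for row in rows:
        subject = f"{source}:{row[mapping['id_field']]}"
        for field, (predicate, normalize) in mapping["fields"].items():
            triples.append((subject, predicate, normalize(row[field])))
    return triples


def select_subjects(triples, predicate, condition):
    """Return sorted subjects having `predicate` with an object satisfying `condition`."""
    return sorted({s for s, p, o in triples if p == predicate and condition(o)})


# Combine both sources into one set of triples expressed in the shared vocabulary.
triples = to_triples("A", SOURCE_A) + to_triples("B", SOURCE_B)

# Rough SPARQL analogue (hypothetical property name):
#   SELECT ?p WHERE { ?p pco:hasGleasonScore ?g . FILTER(?g >= 7) }
high_grade = select_subjects(triples, "pco:hasGleasonScore", lambda g: g >= 7)
print(high_grade)
```

The design point is that the query refers only to the shared property name, so it does not need to know that one source stores the score as an integer in `gleason` and the other as a string in `GleasonScore`.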
