An ontology knowledge inspection methodology for quality assessment and continuous improvement

Abstract: Ontology-learning methods were introduced in knowledge engineering to build ontologies automatically from natural language texts related to a domain. Despite the initial appeal of these methods, automatically generated ontologies may contain errors, inconsistencies, and poor design quality, all of which must be fixed manually to maintain the validity and usefulness of the automated output. In this work, we propose a methodology to assess ontology quality (quantitatively and graphically) and to fix ontology inconsistencies while minimising design defects. The proposed methodology is based on the Deming cycle and is grounded in quality standards that have proved effective in software engineering and show high potential for extension to knowledge engineering quality management. This paper demonstrates that software engineering quality assessment approaches and techniques can be successfully extended and applied to the ontology-fixing and quality improvement problem. The proposed methodology was validated on a test ontology by comparing the design quality of a manually created ontology with that of an automatically generated one.
