Analysis and Prediction of User Editing Patterns in Ontology Development Projects

The development of real-world ontologies is a complex undertaking, commonly involving a group of domain experts with different expertise who work together in a collaborative setting. These ontologies are usually large and have complex structures. Ontology tools are key to making the authoring process as streamlined as possible. Being able to predict with confidence what users are likely to do next as they edit an ontology will enable us to focus and structure the user interface accordingly and to facilitate more efficient interaction and information discovery. In this paper, we use data mining, specifically association rule mining, to investigate whether we can predict the next editing operation that a user will make based on the change history. We simulated and evaluated continuous prediction over time using a sliding window model. We used association rule mining to generate patterns from the ontology change logs in the training window and tested these patterns on the logs in the adjacent testing window. We also evaluated the impact of different training and testing window sizes on prediction accuracy. Finally, we evaluated prediction accuracy across different user groups and different ontologies. Our results indicate that we can indeed predict the next editing operation a user is likely to make. We will use the discovered editing patterns to develop a recommendation module for our editing tools and to design user interface components that better fit user editing behavior.
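The sliding window evaluation described above can be illustrated with a minimal sketch. It is not the implementation used in the paper: it assumes the change log is a flat list of operation names, restricts rules to the simple form "previous operation -> next operation", and uses hypothetical operation names, function names, and thresholds (`mine_rules`, `sliding_window_accuracy`, `min_support`, `min_confidence`) chosen only for illustration.

```python
from collections import Counter


def mine_rules(ops, min_support=2, min_confidence=0.5):
    """Mine rules 'previous op -> next op' from consecutive pairs in a window.

    Simplifying assumption: only single-operation antecedents are considered.
    For each antecedent, the consequent with the highest confidence is kept.
    """
    pair_counts = Counter(zip(ops, ops[1:]))
    antecedent_counts = Counter(ops[:-1])
    best = {}
    for (prev, nxt), count in pair_counts.items():
        confidence = count / antecedent_counts[prev]
        if count >= min_support and confidence >= min_confidence:
            if prev not in best or confidence > best[prev][1]:
                best[prev] = (nxt, confidence)
    return {prev: nxt for prev, (nxt, _) in best.items()}


def sliding_window_accuracy(log, train_size, test_size):
    """Train on one window, test on the adjacent window, then slide forward."""
    accuracies = []
    start = 0
    while start + train_size + test_size <= len(log):
        train = log[start:start + train_size]
        # The last training operation serves as context for the first test one.
        test = log[start + train_size - 1:start + train_size + test_size]
        rules = mine_rules(train)
        hits = 0
        total = 0
        for prev, actual in zip(test, test[1:]):
            total += 1
            # A missing rule counts as a wrong prediction.
            if rules.get(prev) == actual:
                hits += 1
        accuracies.append(hits / total)
        start += test_size  # slide by one testing window
    return accuracies


# Hypothetical change log; real logs would come from the ontology editor.
log = ["create_class", "edit_label", "add_parent"] * 4
print(sliding_window_accuracy(log, train_size=6, test_size=3))
```

Varying `train_size` and `test_size` in this sketch mirrors, in a simplified way, the experiment on how window sizes affect prediction accuracy.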
