Semi-automatic terminology ontology learning based on topic modeling

Abstract: Ontologies provide a common vocabulary, reusability, and machine-readable content; they also enable semantic search, facilitate agent interaction, and support the ordering and structuring of knowledge for Semantic Web (Web 3.0) applications. However, a central challenge in ontology engineering is automatic learning: there is still no fully automatic approach that uses machine learning techniques to form an ontology from a text corpus or dataset covering various topics. In this paper, two topic modeling algorithms, LSI & SVD and Mr.LDA, are explored for learning a topic ontology. The objective is to determine the statistical relationships between documents and terms in order to build a topic ontology and ontology graph with minimal human intervention. Experimental analysis of building a topic ontology and semantically retrieving the corresponding topic ontology for a user's query demonstrates the effectiveness of the proposed approach.
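
To illustrate the kind of statistical relationship between documents and terms that LSI with SVD exposes, the following is a minimal sketch and not the authors' implementation. The toy term-document matrix, the term labels, the function names `lsi` and `cosine`, and the choice of k = 2 are all assumptions made for illustration; Mr.LDA is not covered here.

```python
import numpy as np

# Hypothetical toy term-document count matrix (rows: terms, columns: documents).
# Terms 0-1 occur only in documents 0-1; terms 2-3 occur only in documents 2-3.
terms = ["ontology", "semantic", "topic", "model"]
X = np.array([
    [2.0, 1.0, 0.0, 0.0],
    [1.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 1.0],
    [0.0, 0.0, 1.0, 3.0],
])

def lsi(matrix, k):
    """Project terms and documents into a k-dimensional latent space via truncated SVD."""
    U, s, Vt = np.linalg.svd(matrix, full_matrices=False)
    term_vecs = U[:, :k] * s[:k]      # one row per term
    doc_vecs = Vt[:k, :].T * s[:k]    # one row per document
    return term_vecs, doc_vecs

def cosine(a, b):
    """Cosine similarity between two latent-space vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

term_vecs, doc_vecs = lsi(X, k=2)

# Terms that co-occur in the same documents end up close in the latent space,
# while terms from disjoint document groups are near-orthogonal.
same_group = cosine(term_vecs[0], term_vecs[1])
cross_group = cosine(term_vecs[0], term_vecs[2])
print(f"{terms[0]} vs {terms[1]}: {same_group:.3f}")
print(f"{terms[0]} vs {terms[2]}: {cross_group:.3f}")
```

In a sketch like this, groups of terms with high mutual similarity could serve as candidate topic concepts, which a human would then review; how the paper actually derives ontology nodes and edges is not specified in the abstract.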
