A Latent-Dirichlet-Allocation Based Extension for Domain Ontology of Enterprise's Technological Innovation

This paper proposes a method for automatically building a domain ontology of enterprise's technological innovation from a plain-text corpus, based on Latent Dirichlet Allocation (LDA). The method consists of four modules: 1) introducing a seed ontology for the domain of enterprise's technological innovation, 2) preprocessing the collected textual data with Natural Language Processing (NLP) techniques, 3) mining domain-specific terms from the document collection with LDA, and 4) obtaining the relationships between the terms through a set of defined rules. Experiments were carried out to demonstrate the effectiveness of the method, and the results indicate that many terms in the domain of enterprise's technological innovation, together with the semantic relations between them, were discovered. The method is a continuous, iterative process: the resulting ontology can be fed back in as the seed ontology for the next round, so that ongoing knowledge acquisition in the domain keeps updating and refining the seed ontology.
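The term-mining module and the feedback loop can be illustrated with a minimal sketch. It is not the authors' implementation: the toy corpus `DOCS`, the seed set `SEED_TERMS`, the helper names `lda_gibbs` and `candidate_terms`, and the hyperparameters `alpha` and `beta` are all assumptions made for illustration, and collapsed Gibbs sampling is simply one common way to fit LDA. The relation-extraction rules of module 4 are not specified in the abstract and are left out.

```python
import random
from collections import Counter

# Hypothetical toy corpus, assumed to be already tokenized by the NLP step.
DOCS = [
    ["innovation", "technology", "patent", "research", "patent", "development"],
    ["technology", "research", "development", "patent", "innovation", "laboratory"],
    ["market", "sales", "customer", "innovation", "market", "revenue"],
    ["customer", "revenue", "sales", "market", "technology", "customer"],
]
# Hypothetical seed ontology, reduced here to a flat set of terms.
SEED_TERMS = {"innovation", "technology"}


def lda_gibbs(docs, num_topics, iterations=100, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return per-topic word counts."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(zs)
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample from the conditional distribution over topics.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return topic_word


def candidate_terms(topic_word, seed_terms, n=3):
    """Top-n words of each topic that the seed ontology does not yet contain."""
    found = set()
    for counts in topic_word:
        found.update(w for w, c in counts.most_common(n) if c > 0)
    return found - set(seed_terms)


topics = lda_gibbs(DOCS, num_topics=2)
new_terms = candidate_terms(topics, SEED_TERMS)
# Feedback loop: the extended term set becomes the seed for the next round.
next_seed = SEED_TERMS | new_terms
print(sorted(new_terms))
```

Because the sampler is stochastic, the mined terms depend on `seed` and on the number of iterations; in practice the candidates would be reviewed or filtered before being merged into the ontology.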
