Automatic Concept Extraction Based on Semantic Graphs From Big Data in Smart City

With the rapid development of smart cities, sensors of many kinds can quickly collect large volumes of data, and discovering useful knowledge in this data is increasingly important. In knowledge engineering, knowledge graphs, and domain knowledge graphs in particular, play an important role and have become the infrastructure of knowledge-driven intelligent applications on the Internet. Domain concept extraction is critical to the construction of domain knowledge graphs. Existing concept extraction methods do not make full use of semantic information, although exploiting it can improve extraction results. In this article, we propose a novel concept extraction method, Semantic Graph-Based Concept Extraction (SGCCE). First, similarities between terms are calculated using word co-occurrence, the LDA topic model, and Word2Vec. Then, a semantic graph of terms is constructed from these similarities. Finally, community detection algorithms divide the terms in the semantic graph into communities, each of which acts as a concept. In the experiments, we compare the concept extraction results obtained with different community detection algorithms to analyze the different semantic graphs. The experimental results show that the proposed method makes effective use of semantic information and yields better concept extraction results on domain big data in smart cities.
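
The three steps of the method (term similarity, semantic graph, community detection) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It rests on several assumptions: only a co-occurrence similarity is computed (LDA and Word2Vec similarities would need trained models and are omitted), the weighted sum in `combine` and the `threshold` value are hypothetical choices, the toy `docs` are invented for illustration, and a simple greedy modularity agglomeration stands in for whichever community detection algorithms the article evaluates.

```python
from itertools import combinations
from math import sqrt


def cooccurrence_similarity(docs):
    """Cosine similarity between terms, based on the documents they share."""
    occurs = {}
    for i, doc in enumerate(docs):
        for term in set(doc):
            occurs.setdefault(term, set()).add(i)
    sims = {}
    for a, b in combinations(sorted(occurs), 2):  # a < b alphabetically
        shared = len(occurs[a] & occurs[b])
        if shared:
            sims[(a, b)] = shared / sqrt(len(occurs[a]) * len(occurs[b]))
    return sims


def combine(sim_maps, weights):
    """Weighted sum of several similarity maps (an assumed combination rule)."""
    combined = {}
    for sims, w in zip(sim_maps, weights):
        for pair, s in sims.items():
            combined[pair] = combined.get(pair, 0.0) + w * s
    return combined


def build_graph(sims, threshold):
    """Semantic graph as weighted edges; weak similarities are dropped."""
    return {pair: s for pair, s in sims.items() if s >= threshold}


def greedy_modularity(edges):
    """Merge the community pair with the largest modularity gain until none helps."""
    m = sum(edges.values())  # total edge weight
    members, degree, between = {}, {}, {}
    for (u, v), w in edges.items():
        for n in (u, v):
            members.setdefault(n, {n})
            degree[n] = degree.get(n, 0.0) + w
        between[(u, v)] = w
    while True:
        best_gain, best_pair = 0.0, None
        for (a, b), w in sorted(between.items()):
            # Modularity change when communities a and b are merged.
            gain = w / m - degree[a] * degree[b] / (2 * m * m)
            if gain > best_gain + 1e-12:
                best_gain, best_pair = gain, (a, b)
        if best_pair is None:
            break
        a, b = best_pair
        members[a] |= members.pop(b)
        degree[a] += degree.pop(b)
        merged = {}
        for (x, y), w in between.items():
            x = a if x == b else x
            y = a if y == b else y
            if x != y:  # edges inside the merged community are no longer "between"
                key = (min(x, y), max(x, y))
                merged[key] = merged.get(key, 0.0) + w
        between = merged
    return sorted(sorted(c) for c in members.values())


# Toy corpus (invented): each document is a list of already-extracted terms.
docs = [
    ["sensor", "traffic", "camera"],
    ["sensor", "camera"],
    ["traffic", "sensor", "camera"],
    ["hospital", "clinic", "doctor"],
    ["clinic", "doctor"],
    ["hospital", "doctor", "clinic"],
    ["camera", "hospital"],
]
sims = combine([cooccurrence_similarity(docs)], [1.0])
graph = build_graph(sims, threshold=0.2)
concepts = greedy_modularity(graph)
print(concepts)
```

On this toy corpus the weak `camera`/`hospital` link survives the threshold, yet the modularity criterion still separates the traffic-related terms from the medical ones, so each resulting community can be read as one candidate concept. Terms with no edge above the threshold do not appear in the graph and are therefore not assigned to any community in this sketch.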
