Automatic Non-Taxonomic Relation Extraction from Big Data in Smart City

The explosive growth of data in smart cities is making domain big data a hot topic for knowledge extraction. Non-taxonomic relations, that is, any relations between concept pairs other than the is-a relation, are an important part of a knowledge graph. In this paper, targeting big data in smart cities, we present a multi-phase correlation search framework that automatically extracts non-taxonomic relations from domain documents. Different kinds of semantic information are used to improve the performance of the system. First, inspired by work on network representation, we propose a Semantic Graph-Based method that combines the structure information of a semantic graph with the context information of terms to identify non-taxonomic relationships. Second, verb sets of different semantic types are extracted based on dependency syntactic information and then ranked to serve as non-taxonomic relationship labels. Extensive experiments demonstrate the effectiveness of the proposed framework. The F1 value reaches 81.4% for the identification of non-taxonomic relationships. The overall precision of non-taxonomic relationship label extraction is 73.4%, and 87.8% of non-taxonomic relations can be provided with “good” labels. We hope this article offers a useful approach to knowledge extraction from domain big data in smart cities.
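The label-ranking step can be illustrated with a minimal sketch. It assumes that (subject, verb, object) triples have already been produced by a dependency parser, and it ranks candidate verbs by plain frequency; the abstract does not specify the ranking criterion, so the frequency ranking, the function name `rank_verb_labels`, and the sample triples are all hypothetical and used for illustration only.

```python
from collections import Counter, defaultdict


def rank_verb_labels(triples, concept_pairs):
    """Rank candidate verb labels for each concept pair.

    triples: iterable of (subject, verb, object) tuples, assumed to come
             from a dependency parser (not implemented here).
    concept_pairs: list of (subject, object) pairs already identified as
             holding a non-taxonomic relation.
    Ranking by raw frequency is an assumption, not the paper's method.
    """
    wanted = set(concept_pairs)
    counts = defaultdict(Counter)
    for subj, verb, obj in triples:
        if (subj, obj) in wanted:
            counts[(subj, obj)][verb] += 1
    # Most frequent verb first; pairs with no observed verb get an empty list.
    return {
        pair: [verb for verb, _ in counts[pair].most_common()]
        for pair in concept_pairs
    }


# Hypothetical sample data for illustration only.
sample_triples = [
    ("sensor", "collect", "data"),
    ("sensor", "transmit", "data"),
    ("sensor", "collect", "data"),
    ("cloud", "store", "data"),
]
sample_pairs = [("sensor", "data"), ("cloud", "data"), ("city", "data")]
ranked = rank_verb_labels(sample_triples, sample_pairs)
print(ranked)
```

In this sketch the top-ranked verb for each pair would act as the candidate relation label; a fuller system would presumably group verbs by semantic type before ranking, as the abstract describes.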
