Set-based Noise Elimination for Is-a Relations in a Large-Scale Lexical Taxonomy

As the significance of knowledge bases has been widely accepted over the past decade, efficiently eliminating noise from them has become a key problem, since automatically constructed knowledge bases usually contain a large amount of noise that hinders their application. Based on the observation that the genuine entities of a concept A always share several common ancestors besides A, we propose a noise elimination approach for Is-a relations. In this paper, we elaborate on this approach and explain its pseudocode. Our experimental results demonstrate that the approach is capable of eliminating noise in the knowledge base efficiently.
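To make the shared-ancestor observation concrete, the following Python sketch flags entities of a concept that share too few ancestors with the concept's other entities. It is an illustration under stated assumptions, not the paper's pseudocode: the function name `find_noisy_entities`, the parameters `min_support` and `min_shared`, and the toy taxonomy are all hypothetical.

```python
from collections import Counter


def find_noisy_entities(is_a, concept, min_support=0.5, min_shared=1):
    """Flag entities whose Is-a link to `concept` looks like noise.

    is_a maps each entity to the set of concepts it is claimed to belong to.
    min_support and min_shared are illustrative thresholds (assumptions),
    not values taken from the paper.
    """
    # Entities that the taxonomy claims are instances of `concept`.
    members = [e for e, parents in is_a.items() if concept in parents]

    # Count how many members have each ancestor other than `concept`.
    counts = Counter()
    for entity in members:
        counts.update(is_a[entity] - {concept})

    # Ancestors held by at least a min_support fraction of the members.
    threshold = min_support * len(members)
    common = {a for a, c in counts.items() if c >= threshold}

    # A member sharing fewer than min_shared common ancestors is suspicious.
    return sorted(
        e for e in members
        if len((is_a[e] - {concept}) & common) < min_shared
    )


# Toy taxonomy (hypothetical data for illustration only).
toy_is_a = {
    "apple": {"fruit", "food", "plant"},
    "banana": {"fruit", "food", "plant"},
    "pear": {"fruit", "food"},
    "iphone": {"fruit", "device", "product"},
}

noisy = find_noisy_entities(toy_is_a, "fruit")
print(noisy)  # ['iphone']
```

In the toy data, "food" and "plant" are shared by most entities of "fruit", while "iphone" has neither, so its Is-a link to "fruit" is flagged. Working with plain sets keeps the check cheap per concept; how the actual approach chooses its thresholds and ancestor sets is not specified in the abstract.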
