Tolerance Rough Set Based Attribute Extraction Approach for Multiple Semantic Knowledge Base Integration

In the integration of multiple semantic knowledge bases (SKBs), the inconsistency of items, or of their attributes, across different SKBs remains an open challenge for researchers. To address this issue, this paper presents an approach based on extracting common class attributes and establishing unified category-attribute templates. Because the semantic analysis involved in selecting a specific attribute from numerous candidates is inherently uncertain and vague, tolerance rough set (TRS) techniques are applied to construct class-attribute templates from online SKBs. Attribute extraction is carried out with statistical techniques and is integrated into the TRS framework. Finally, experiments are conducted on randomly selected categories. The experimental results indicate that the proposed approach is effective.
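To make the TRS idea concrete, the following is a minimal sketch of the generic tolerance rough set model, not the paper's actual method. It assumes that two attributes are tolerant of each other when they co-occur in at least `theta` item records; the function names, the `theta` threshold, and the co-occurrence criterion are all illustrative assumptions.

```python
from collections import defaultdict
from itertools import combinations


def tolerance_classes(records, theta):
    """Map each attribute to its tolerance class.

    Assumption: attributes a and b are tolerant when they co-occur
    in at least `theta` records. Every attribute is in its own class.
    """
    cooccur = defaultdict(int)
    vocab = set()
    for attrs in records:
        unique = set(attrs)
        vocab |= unique
        # Count each unordered pair once per record.
        for a, b in combinations(sorted(unique), 2):
            cooccur[(a, b)] += 1
    classes = {a: {a} for a in vocab}
    for (a, b), count in cooccur.items():
        if count >= theta:
            classes[a].add(b)
            classes[b].add(a)
    return classes


def lower_approximation(classes, target):
    """Attributes whose whole tolerance class lies inside `target`."""
    return {a for a, cls in classes.items() if cls <= target}


def upper_approximation(classes, target):
    """Attributes whose tolerance class overlaps `target`."""
    return {a for a, cls in classes.items() if cls & target}
```

Under these assumptions, the lower approximation collects attributes that certainly belong to a candidate template, while the upper approximation also admits attributes that are only possibly related to it.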
