Mining and Explaining Relationships in Wikipedia

Mining and explaining relationships between objects are challenging tasks in knowledge search. We propose a new approach to these tasks that uses disjoint paths formed by links in Wikipedia. To realize this approach, we propose a naive method and a generalized-flow-based method, together with a technique that avoids flow confluences so that a generalized flow is forced to be as disjoint as possible. We also apply the approach to the classification of relationships. Our experiments show that the generalized-flow-based method can mine many disjoint paths that are important for a relationship, and that the classification is effective for explaining relationships.
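To make the notion of disjoint paths concrete, the following is a minimal sketch, not the naive or generalized-flow-based method of the paper. It assumes a greedy strategy: repeatedly take a shortest link path between two pages by breadth-first search, then forbid its intermediate pages so that later paths share none of them. The toy link graph, the page names, and the function names `shortest_path` and `greedy_disjoint_paths` are hypothetical and chosen only for illustration.

```python
from collections import deque


def shortest_path(graph, source, target, banned):
    """Return a BFS shortest path from source to target that avoids banned pages."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the parent pointers back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in graph.get(node, ()):
            if nxt not in parents and nxt not in banned:
                parents[nxt] = node
                queue.append(nxt)
    return None


def greedy_disjoint_paths(links, source, target):
    """Greedily collect paths whose intermediate pages are pairwise disjoint.

    Assumption: a simple greedy heuristic, which may find fewer disjoint
    paths than a flow-based method would.
    """
    if source == target:
        raise ValueError("source and target must differ")
    graph = {page: list(outs) for page, outs in links.items()}  # private copy
    used = set()   # intermediate pages already claimed by an earlier path
    paths = []
    while True:
        path = shortest_path(graph, source, target, used)
        if path is None:
            return paths
        paths.append(path)
        if len(path) == 2:
            # A direct link has no intermediate page; drop it so it is not reused.
            graph[source] = [p for p in graph[source] if p != target]
        used.update(path[1:-1])


# Hypothetical toy link graph (page -> pages it links to).
links = {
    "Petroleum": ["Gasoline", "OPEC", "USA"],
    "Gasoline": ["USA"],
    "OPEC": ["Saudi Arabia"],
    "Saudi Arabia": ["USA"],
    "USA": [],
}
paths = greedy_disjoint_paths(links, "Petroleum", "USA")
for p in paths:
    print(" -> ".join(p))
```

On this toy graph the sketch returns three paths: the direct link, one through "Gasoline", and one through "OPEC" and "Saudi Arabia". Each intermediate page appears in only one path, so each path can be read as a separate explanation of the relationship.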
