Learning Relational Features with Backward Random Walks

The path ranking algorithm (PRA) was recently proposed to address relational classification and retrieval tasks at large scale. We describe Cor-PRA, an enhanced system that models a larger space of relational rules, including longer relational rules and a class of first-order rules with constants, while maintaining scalability. We describe and test faster algorithms for searching for these features. A key contribution is the use of backward random walks to discover these types of rules efficiently. We conduct an empirical study on two tasks: graph-based knowledge base inference and person named entity extraction from parsed text. Our results show that learning paths with constants improves performance on both tasks, and that modeling longer paths dramatically improves performance on the named entity extraction task.
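To make the idea of a backward random walk concrete, the following is a minimal sketch on a toy graph. It is not the Cor-PRA implementation: the triples, the relation names, and the helpers `build_index` and `walk` are all hypothetical, and the walk simply spreads probability uniformly over the edges of one relation at each step. A forward walk follows a relation path from a source node; a backward walk follows the reversed path over inverted edges, starting from a target (or constant) node, and so shows which sources could have reached it.

```python
from collections import defaultdict

# Toy knowledge graph as (head, relation, tail) triples; all names are hypothetical.
TRIPLES = [
    ("alice", "worksFor", "acme"),
    ("bob", "worksFor", "acme"),
    ("acme", "locatedIn", "pittsburgh"),
    ("carol", "worksFor", "globex"),
    ("globex", "locatedIn", "boston"),
]

def build_index(triples):
    """Return (forward, backward) adjacency maps: relation -> node -> neighbors."""
    fwd = defaultdict(lambda: defaultdict(list))
    bwd = defaultdict(lambda: defaultdict(list))
    for head, rel, tail in triples:
        fwd[rel][head].append(tail)   # follow the edge head -> tail
        bwd[rel][tail].append(head)   # follow the inverted edge tail -> head
    return fwd, bwd

def walk(start, path, index):
    """Path-constrained random walk: at each step, move uniformly at random
    along edges of the next relation in `path`. Returns node -> probability."""
    dist = {start: 1.0}
    for rel in path:
        nxt = defaultdict(float)
        for node, prob in dist.items():
            neighbors = index[rel].get(node, [])
            for n in neighbors:
                nxt[n] += prob / len(neighbors)
        dist = dict(nxt)
    return dist

fwd, bwd = build_index(TRIPLES)

# Forward walk from a source along worksFor, then locatedIn.
forward = walk("alice", ["worksFor", "locatedIn"], fwd)

# Backward walk from a target along the reversed path over inverted edges.
backward = walk("pittsburgh", ["locatedIn", "worksFor"], bwd)
```

In this toy graph the forward walk from `alice` ends at `pittsburgh` with all of its probability mass, while the backward walk from `pittsburgh` splits its mass evenly between `alice` and `bob`, the two sources connected to it by that path.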
