Scaling Inference for Markov Logic with a Task-Decomposition Approach

Motivated by applications in large-scale knowledge base construction, we study the problem of scaling up a sophisticated statistical inference framework called Markov Logic Networks (MLNs). Our approach, Felix, uses the idea of Lagrangian relaxation from mathematical programming to decompose a program into smaller tasks while preserving the joint-inference property of the original MLN. The advantage is that we can use highly scalable specialized algorithms for common tasks such as classification and coreference. We propose an architecture to support Lagrangian relaxation in an RDBMS, which we show enables scalable joint inference for MLNs. We empirically validate that Felix is significantly more scalable and efficient than prior approaches to MLN inference by constructing a knowledge base from 1.8M documents as part of the TAC challenge. We show that Felix scales to this corpus and achieves state-of-the-art quality. In contrast, prior approaches do not scale even to a subset of the corpus that is three orders of magnitude smaller.
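To make the decomposition idea concrete, the following is a minimal sketch of Lagrangian relaxation (dual decomposition) on a toy problem. It is not Felix's implementation: the two "tasks", their per-variable scores, and all function names are hypothetical and chosen only for illustration. Each task keeps its own copy of the shared binary variables and is solved independently; Lagrange multipliers are adjusted by subgradient steps until the two copies agree, at which point the agreed assignment is optimal for the joint objective.

```python
from typing import List, Tuple


def solve_subtask(scores: List[float], lam: List[float], sign: float) -> List[int]:
    """Solve one toy task independently (hypothetical stand-in for a specialized solver).

    Each variable i contributes scores[i] + sign * lam[i] when set to 1 and 0 otherwise,
    so the best choice is 1 exactly when that adjusted score is positive.
    """
    return [1 if s + sign * m > 0 else 0 for s, m in zip(scores, lam)]


def dual_decomposition(
    scores_a: List[float], scores_b: List[float], max_iters: int = 100
) -> Tuple[List[int], bool]:
    """Maximize sum_i (scores_a[i] + scores_b[i]) * x_i by splitting it into two tasks.

    Task A sees scores_a plus the multipliers, task B sees scores_b minus them.
    Returns the final assignment and whether the two tasks reached agreement.
    """
    lam = [0.0] * len(scores_a)
    x_a: List[int] = []
    for t in range(1, max_iters + 1):
        x_a = solve_subtask(scores_a, lam, +1.0)
        x_b = solve_subtask(scores_b, lam, -1.0)
        if x_a == x_b:
            # The copies agree, so the relaxed solution is feasible for the joint problem.
            return x_a, True
        step = 1.0 / t  # decaying step size, an assumption of this sketch
        # Subgradient step on the dual: move multipliers against the disagreement.
        lam = [m - step * (a - b) for m, a, b in zip(lam, x_a, x_b)]
    return x_a, False


if __name__ == "__main__":
    assignment, agreed = dual_decomposition([2.0, -3.0, 0.5], [-1.0, 1.0, 0.25])
    print(assignment, agreed)
```

In this toy the tasks are trivially separable per variable; the point is only the coordination pattern, in which independent solvers exchange nothing but multipliers. In general the loop may stop without agreement, which is why the sketch returns a flag rather than assuming convergence.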
