Parameter Learning of Logic Programs for Symbolic-Statistical Modeling

We propose a logical/mathematical framework for statistical parameter learning of parameterized logic programs, i.e., definite clause programs containing probabilistic facts with a parameterized distribution. The framework extends the traditional least Herbrand model semantics of logic programming to distribution semantics: a possible world semantics with a probability distribution that is unconditionally applicable to arbitrary logic programs, including ones for HMMs, PCFGs, and Bayesian networks. We also propose a new EM algorithm, the graphical EM algorithm, which applies to a class of parameterized logic programs representing sequential decision processes in which each decision is exclusive and independent. It runs on a new data structure called a support graph, which describes the logical relationship between observations and their explanations, and it learns parameters by computing inside and outside probabilities generalized for logic programs. The complexity analysis shows that, when combined with OLDT search for all explanations of the observations, the graphical EM algorithm, despite its generality, has the same time complexity as existing EM algorithms developed independently in each research field, namely the Baum-Welch algorithm for HMMs, the Inside-Outside algorithm for PCFGs, and the EM algorithm for singly connected Bayesian networks. Learning experiments with PCFGs using two corpora of moderate size indicate that the graphical EM algorithm can significantly outperform the Inside-Outside algorithm.
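To make the idea of inside and outside probabilities on a support graph concrete, the following is a minimal sketch and not the algorithm as specified in the paper. It assumes an acyclic support graph in which the explanations of each goal are mutually exclusive and the items within an explanation are independent. The graph encoding, the names `GRAPH`, `topo_order`, `expl_prob`, `inside_probs`, and `expected_counts`, and the toy coin example are all hypothetical choices made for illustration. The sketch also assumes the observation has nonzero probability.

```python
from collections import defaultdict

# Hypothetical toy support graph (an assumption, not taken from the paper).
# Each goal maps to its mutually exclusive explanations; each explanation is
# a list of ("fact", name) or ("goal", name) items assumed independent.
# "obs" holds if the first toss is heads, or it is tails and a retry is heads.
GRAPH = {
    "obs": [[("fact", "coin=h")], [("fact", "coin=t"), ("goal", "retry")]],
    "retry": [[("fact", "coin=h")]],
}


def topo_order(graph, top):
    """Return the goals reachable from `top`, parents before children."""
    seen, post = set(), []

    def visit(goal):
        if goal in seen:
            return
        seen.add(goal)
        for expl in graph[goal]:
            for kind, name in expl:
                if kind == "goal":
                    visit(name)
        post.append(goal)

    visit(top)
    return post[::-1]  # reverse post-order of a DAG is a topological order


def expl_prob(expl, theta, inside):
    """Probability of one explanation: product over its facts and subgoals."""
    p = 1.0
    for kind, name in expl:
        p *= theta[name] if kind == "fact" else inside[name]
    return p


def inside_probs(graph, theta, order):
    """Inside probability of each goal: sum over its exclusive explanations."""
    inside = {}
    for goal in reversed(order):  # children first
        inside[goal] = sum(expl_prob(e, theta, inside) for e in graph[goal])
    return inside


def expected_counts(graph, theta, top):
    """Return P(top) and the expected number of uses of each fact given top."""
    order = topo_order(graph, top)
    inside = inside_probs(graph, theta, order)
    outside = defaultdict(float)
    outside[top] = 1.0
    counts = defaultdict(float)
    for goal in order:  # parents first, so outside[goal] is already complete
        for expl in graph[goal]:
            p = expl_prob(expl, theta, inside)
            for kind, name in expl:
                if kind == "fact":
                    counts[name] += outside[goal] * p / inside[top]
                elif inside[name] > 0.0:
                    outside[name] += outside[goal] * p / inside[name]
    return inside[top], dict(counts)


theta = {"coin=h": 0.5, "coin=t": 0.5}
prob, counts = expected_counts(GRAPH, theta, "obs")
print(prob, counts)
```

With both parameters at 0.5, the observation has probability 0.75, and the expected counts given the observation are 1 for heads and 1/3 for tails. An M-step would renormalize these counts over the outcomes of each probabilistic choice; that step is omitted here to keep the sketch short.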
