Paper Information - Efficient Weight Learning for Markov Logic Networks

Markov logic networks (MLNs) combine Markov networks and first-order logic, and are a powerful and increasingly popular representation for statistical relational learning. The state-of-the-art method for discriminative learning of MLN weights is the voted perceptron algorithm, which is essentially gradient descent with an MPE approximation to the expected sufficient statistics (true clause counts). Unfortunately, these can vary widely between clauses, causing the learning problem to be highly ill-conditioned, and making gradient descent very slow. In this paper, we explore several alternatives, from per-weight learning rates to second-order methods. In particular, we focus on two approaches that avoid computing the partition function: diagonal Newton and scaled conjugate gradient. In experiments on standard SRL datasets, we obtain order-of-magnitude speedups, or more accurate models given comparable learning times.
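The ill-conditioning described above can be illustrated with a toy problem. The sketch below is not the paper's algorithm and involves no MLN inference. It minimizes a separable quadratic whose per-coordinate curvatures differ by four orders of magnitude, a stand-in (an assumption made for illustration) for clauses whose true counts vary widely. All names (`curvature`, `target`, `gradient_descent`, `diagonal_newton`) are hypothetical. Plain gradient descent must use a step size small enough for the steepest coordinate, so the flattest coordinate barely moves. Dividing each gradient component by its own curvature, which is the core idea behind per-weight scaling and diagonal Newton, removes that imbalance.

```python
# Toy objective: f(w) = 0.5 * sum_i curvature[i] * (w[i] - target[i])**2
# The curvatures play the role of widely varying clause-count scales.

def gradient(w, curvature, target):
    """Gradient of the separable quadratic at w."""
    return [h * (wi - ti) for wi, h, ti in zip(w, curvature, target)]

def gradient_descent(w, curvature, target, lr, steps):
    """Plain gradient descent with one global learning rate."""
    for _ in range(steps):
        g = gradient(w, curvature, target)
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w

def diagonal_newton(w, curvature, target, steps):
    """Scale each gradient component by the inverse of its own curvature
    (the diagonal of the Hessian for this objective)."""
    for _ in range(steps):
        g = gradient(w, curvature, target)
        w = [wi - gi / h for wi, gi, h in zip(w, g, curvature)]
    return w

def max_error(w, target):
    """Largest absolute distance from the optimum over all coordinates."""
    return max(abs(wi - ti) for wi, ti in zip(w, target))

curvature = [1.0, 100.0, 10000.0]   # badly conditioned: ratio of 10^4
target = [1.0, -2.0, 0.5]           # optimum of the toy objective
start = [0.0, 0.0, 0.0]

# The global step size is limited by the largest curvature.
gd = gradient_descent(start, curvature, target, lr=1.0 / max(curvature), steps=100)
dn = diagonal_newton(start, curvature, target, steps=1)

print("gradient descent, 100 steps, max error:", max_error(gd, target))
print("diagonal Newton, 1 step, max error:", max_error(dn, target))
```

On this toy objective the Hessian is exactly diagonal, so a single scaled step reaches the optimum. In an MLN the Hessian is not diagonal and the required expectations must be estimated by inference, so the scaled step is only an approximation there.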

Pedro M. Domingos | Daniel Lowd

[1] Martin Fodslette Møller, et al. A scaled conjugate gradient algorithm for fast supervised learning, 1993, Neural Networks.

[2] Mark Craven, et al. Relational Learning with Statistical Predicate Invention: Better Models for Hypertext, 2001, Machine Learning.

[3] Barak A. Pearlmutter. Fast Exact Multiplication by the Hessian, 1994, Neural Computation.

[4] Bart Selman, et al. A general stochastic approach to solving problems with hard and soft constraints, 1996, Satisfiability Problem: Theory and Applications.

[5] Geoffrey E. Hinton. Training Products of Experts by Minimizing Contrastive Divergence, 2002, Neural Computation.

[6] Matthew Richardson, et al. Markov logic networks, 2006, Machine Learning.

[7] Fernando Pereira, et al. Shallow Parsing with Conditional Random Fields, 2003, NAACL.

[8] Pedro M. Domingos, et al. Discriminative Training of Markov Logic Networks, 2005, AAAI.

[9] R. Fletcher. Practical Methods of Optimization, 1988.

[10] J. Besag. On the Statistical Analysis of Dirty Pictures, 1986.

[11] Matthew Richardson, et al. The Alchemy System for Statistical Relational AI: User Manual, 2007.