Combining Subjective Probabilities and Data in Training Markov Logic Networks

Markov logic is a rich language that allows one to specify a knowledge base as a set of weighted first-order logic formulas and to define a probability distribution over truth assignments to ground atoms from this knowledge base. Usually, the weight of a formula cannot be related to the probability of the formula without taking into account the weights of the other formulas. In general this is not an issue, since the weights are learned from training data. However, in many domains (e.g., healthcare, dependable systems), little or no training data may be available, but one has access to a domain expert whose knowledge is available in the form of subjective probabilities. Within the framework of Bayesian statistics, we present a formalism for using a domain expert's knowledge for weight learning. Our approach defines priors that are different from, and more general than, previously used Gaussian priors over weights. We show how one can learn weights in an MLN by combining subjective probabilities and training data, without requiring the domain expert to provide consistent knowledge. Additionally, we provide a formalism for capturing conditional subjective probabilities, which are often easier to obtain and more reliable than unconditional probabilities. We demonstrate the effectiveness of our approach through extensive experiments in a domain that models failure dependencies in a cyber-physical system. Moreover, we demonstrate the advantages of our proposed prior over non-zero-mean Gaussian priors on a commonly cited social network MLN testbed.
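For context, the following is the standard Markov logic formulation from the literature (Richardson and Domingos), not this paper's new prior; the symbols F_i, w_i, n_i(x), mu_i, and sigma_i are the usual ones and appear here only for illustration. An MLN defines the joint distribution over possible worlds

\[
P(X = x) \;=\; \frac{1}{Z}\exp\!\Big(\sum_i w_i\, n_i(x)\Big),
\qquad
Z \;=\; \sum_{x'} \exp\!\Big(\sum_i w_i\, n_i(x')\Big),
\]

where \(n_i(x)\) is the number of true groundings of formula \(F_i\) in world \(x\), which is why a single weight \(w_i\) cannot be mapped to a probability without the other weights. The "previously used Gaussian priors over weights" mentioned above correspond to MAP weight learning of the form

\[
\hat{w} \;=\; \arg\max_{w}\; \log P(x \mid w) \;-\; \sum_i \frac{(w_i - \mu_i)^2}{2\sigma_i^2},
\]

with \(\mu_i = 0\) in the common zero-mean case. The prior proposed in this paper would take the place of the Gaussian penalty term, being derived instead from the expert's (possibly conditional, possibly inconsistent) subjective probabilities.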
