PPDL: Probabilistic Programming with Datalog

Probabilistic programming [6] has recently attracted substantial attention as a prominent paradigm for advancing and facilitating the development of machine-learning applications. A probabilistic-programming language typically consists of two components: a specification of a stochastic process (the prior), and a specification of observations that restrict the probability space to a conditional subspace (the posterior). This paper gives a brief overview of Probabilistic Programming Datalog (PPDL), a recently proposed declarative framework for specifying statistical models on top of a database through an appropriate extension of Datalog [1]. Because it extends Datalog, PPDL integrates naturally with the database and has a robust declarative semantics: the meaning of a program is independent of the algorithmic evaluation of its rules and invariant under logical program transformations. It provides convenient mechanisms for treating common numerical probability functions as first-class citizens of the language; in particular, conclusions of rules may contain values drawn from such functions.
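To make the two components concrete, the following is a minimal Python sketch, not PPDL itself. The database, the probabilities, and the rule shapes in the comments are hypothetical and written in an informal Datalog-like notation. Rule conclusions carry values drawn from a Bernoulli function (here called `flip`), and the posterior is approximated by rejection sampling, a technique chosen only for illustration and not one the framework prescribes.

```python
import random

# Hypothetical input database: each house and the city it is located in.
HOUSES = {"h1": "napa", "h2": "napa", "h3": "yucaipa"}

# Assumed parameters, chosen only for illustration.
P_EARTHQUAKE = 0.1   # per city
P_BURGLARY = 0.05    # per house


def flip(p, rng):
    """Bernoulli draw: returns 1 with probability p, else 0."""
    return 1 if rng.random() < p else 0


def sample_prior(rng):
    """Apply the generative rules once and return one possible world."""
    # Informal rule: Earthquake(c, Flip[0.1]) <- City(c)
    cities = sorted(set(HOUSES.values()))
    earthquake = {c: flip(P_EARTHQUAKE, rng) for c in cities}
    # Informal rule: Burglary(h, Flip[0.05]) <- House(h, c)
    burglary = {h: flip(P_BURGLARY, rng) for h in sorted(HOUSES)}
    # Informal rules: an earthquake triggers the alarm with probability 0.6,
    # a burglary triggers it with probability 0.9.
    alarm = set()
    for h, c in sorted(HOUSES.items()):
        if earthquake[c] and flip(0.6, rng):
            alarm.add(h)
        if burglary[h] and flip(0.9, rng):
            alarm.add(h)
    return {"earthquake": earthquake, "burglary": burglary, "alarm": alarm}


def estimate(query, observation, n=20000, seed=0):
    """Estimate P(query | observation) by rejection sampling from the prior."""
    rng = random.Random(seed)
    kept = hits = 0
    for _ in range(n):
        world = sample_prior(rng)
        if observation(world):      # keep only worlds consistent with the data
            kept += 1
            hits += bool(query(world))
    return hits / kept if kept else float("nan")


def burglary_at_h1(world):
    return world["burglary"]["h1"] == 1


# Prior: no observation restricts the probability space.
prior = estimate(burglary_at_h1, lambda world: True)
# Posterior: condition on the observation that the alarm at h1 went off.
posterior = estimate(burglary_at_h1, lambda world: "h1" in world["alarm"])
print(f"prior={prior:.3f} posterior={posterior:.3f}")
```

Under these assumed numbers, observing the alarm raises the estimated probability of a burglary at `h1` well above its prior value. Rejection sampling discards every sampled world that contradicts the observation, so it is practical only when the observation is not too unlikely.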

[1]  Christopher Ré, et al. Tuffy: Scaling up Statistical Inference in Markov Logic Networks using an RDBMS, 2011, Proc. VLDB Endow.

[2]  Daniel Deutch, et al. On probabilistic fixpoint and Markov chain query languages, 2010, PODS '10.

[3]  Sebastian Rudolph, et al. Extending Decidable Existential Rules by Joining Acyclicity and Guardedness, 2011, IJCAI.

[4]  Pedro M. Domingos, et al. Markov Logic: An Interface Layer for Artificial Intelligence, 2009, Markov Logic: An Interface Layer for Artificial Intelligence.

[5]  Ronald Fagin, et al. Data exchange: semantics and query answering, 2003, Theor. Comput. Sci.

[6]  Dan Suciu, et al. Probabilistic databases, 2011, SIGA.

[7]  Andrea Calì, et al. Datalog+/-: A Family of Logical Knowledge Representation and Query Languages for New Applications, 2010, 2010 25th Annual IEEE Symposium on Logic in Computer Science.

[8]  Balder ten Cate, et al. Declarative Statistical Modeling with Datalog, 2014, ArXiv.

[9]  Noah D. Goodman. The principles and practice of probabilistic programming, 2013, POPL.

[10]  Georg Gottlob, et al. Query answering under probabilistic uncertainty in Datalog+/- ontologies, 2013, Annals of Mathematics and Artificial Intelligence.

[11]  Pierre Senellart, et al. Probabilistic XML: Models and Complexity, 2013, Advances in Probabilistic Databases for Uncertain Information Management.