Variational Methods

Variational methods are an important technique for the approximation of complicated probability distributions. They have applications in statistical physics, data modelling and neural networks.

10.1 Variational free energy minimization

One well-known method for approximating a complex distribution in a physical system is 'mean field theory'. Mean field theory is in fact a special case of a general variational free energy approach of Feynman and Bogoliubov, which we will now study. The key piece of mathematics needed to understand this method is Gibbs' inequality (equation (1.24), exercise 20), which we repeat here.

The relative entropy or Kullback-Leibler divergence between two probability distributions $Q(x)$ and $P(x)$ that are defined over the same alphabet $\mathcal{A}_X$ is

$$ D_{\mathrm{KL}}(Q\|P) = \sum_x Q(x) \log \frac{Q(x)}{P(x)}. \tag{10.1} $$

The relative entropy satisfies $D_{\mathrm{KL}}(Q\|P) \geq 0$ (Gibbs' inequality), with equality only if $Q = P$. Note that in general $D_{\mathrm{KL}}(Q\|P) \neq D_{\mathrm{KL}}(P\|Q)$.

10.1.1 Probability distributions in statistical physics

In statistical physics one often encounters probability distributions of the form