Mixture Models and Representational Power of RBM's, DBN's and DBM's

Here we give a contribution intended to help working out the minimal size of Restricted Boltzmann Machines (RBM's), Deep Belief Networks (DBN's) or Deep Boltzmann Machines (DBM's) which are universal approximators of visible distributions. The representational power of these objects arises from the marginalization of hidden units, an operation which naturally produces mixtures of conditional distributions. We present results on the representational power of mixture models with factorizing mixture components, in particular a sharp bound on the required and sufficient number of mixture components to represent any arbitrary visible distribution. The methods disclose a class of visible distributions which requires the maximal number of mixture components, while all mixture components must be atoms. We derive a test of universal approximating properties and find that an RBM with more than 2^n − 1 parameters is not always a universal approximator of distributions on {0, 1}^n.
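The central observation above, that marginalizing the hidden units of an RBM yields a mixture of factorizing conditional distributions, can be checked numerically on a toy model. The sketch below (sizes, seed and parameter values are illustrative assumptions, not from the text) enumerates all states of a small binary RBM and verifies that the marginal p(v) equals the mixture sum over hidden states h of p(h) times a product of independent Bernoulli factors p(v_i | h):

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 2  # toy numbers of visible/hidden units (illustrative choice)
W = rng.normal(size=(n, m))   # weights
b = rng.normal(size=n)        # visible biases
c = rng.normal(size=m)        # hidden biases

def energy(v, h):
    """Standard RBM energy E(v, h) = -(v'Wh + b'v + c'h)."""
    return -(v @ W @ h + b @ v + c @ h)

states_v = [np.array(s) for s in itertools.product([0, 1], repeat=n)]
states_h = [np.array(s) for s in itertools.product([0, 1], repeat=m)]

# Partition function by brute-force enumeration (feasible only for toy sizes)
Z = sum(np.exp(-energy(v, h)) for v in states_v for h in states_h)

# Marginal p(v): sum the Boltzmann weights over all hidden states
p_v = np.array([sum(np.exp(-energy(v, h)) for h in states_h) / Z
                for v in states_v])

# Same marginal written as a mixture: p(v) = sum_h p(h) * prod_i p(v_i | h),
# where each mixture component p(. | h) factorizes over the visible units
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

p_h = np.array([sum(np.exp(-energy(v, h)) for v in states_v) / Z
                for h in states_h])
p_v_mix = np.zeros(len(states_v))
for ph, h in zip(p_h, states_h):
    act = sigmoid(W @ h + b)  # p(v_i = 1 | h): one independent Bernoulli per unit
    for k, v in enumerate(states_v):
        p_v_mix[k] += ph * np.prod(np.where(v == 1, act, 1 - act))

assert np.allclose(p_v, p_v_mix)  # marginalization = mixture of product distributions
```

The number of mixture components equals the number of hidden states, 2^m, which is why counting the components needed to represent a given visible distribution translates into a bound on the required number of hidden units.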