Measuring the Usefulness of Hidden Units in Boltzmann Machines with Mutual Information

Restricted Boltzmann machines (RBMs) and deep Boltzmann machines (DBMs) are important models in deep learning, but it is often difficult to measure their performance in general, or to measure the importance of individual hidden units in particular. We propose using mutual information to measure the usefulness of individual hidden units in Boltzmann machines. The measure serves as an upper bound on the information the neuron can pass on, which enables detection of a particular kind of poor training result. We confirm experimentally that the proposed measure indicates how much the performance of the model drops when some of the units of an RBM are pruned away. Our experiments on DBMs highlight differences among pretraining options.
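
As a rough illustration of the kind of quantity described above, the sketch below estimates the mutual information between each hidden unit of a binary RBM and the visible vector as I(v; h_j) = H(h_j) - H(h_j | v), in bits. It is a minimal sketch under stated assumptions, not necessarily the estimator used in the paper: the expectation over v is approximated by an average over data samples, and the names `hidden_unit_mutual_information`, `V`, `W`, and `c` are hypothetical. Because h_j is binary, the estimate cannot exceed 1 bit.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def binary_entropy(p, eps=1e-12):
    # Entropy in bits of a Bernoulli variable with success probability p.
    p = np.clip(p, eps, 1.0 - eps)
    return -(p * np.log2(p) + (1.0 - p) * np.log2(1.0 - p))

def hidden_unit_mutual_information(V, W, c):
    """Estimate I(v; h_j) for every hidden unit j of a binary RBM.

    V: (n_samples, n_visible) binary data (assumed to stand in for p(v))
    W: (n_visible, n_hidden) weight matrix
    c: (n_hidden,) hidden biases
    """
    P = sigmoid(V @ W + c)                                # p(h_j = 1 | v) per sample
    marginal_entropy = binary_entropy(P.mean(axis=0))     # H(h_j)
    conditional_entropy = binary_entropy(P).mean(axis=0)  # H(h_j | v)
    return marginal_entropy - conditional_entropy

# Toy usage with random data and weights (illustrative values only).
rng = np.random.default_rng(0)
V = rng.integers(0, 2, size=(500, 6)).astype(float)
W = rng.normal(scale=1.0, size=(6, 4))
W[:, 0] = 0.0   # unit 0 ignores its input, so it should carry no information
c = np.zeros(4)

mi = hidden_unit_mutual_information(V, W, c)
print(mi)
```

In this toy setup, the unit whose weights are all zero has a mutual information of zero, which is the kind of uninformative unit such a measure is meant to flag as a pruning candidate.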
