Mutual information for fitting deep nonlinear models

Deep nonlinear models are difficult to fit because the hidden layer is not observed and the relation between the initial and observed layers may be non-affine. In this work we investigate information-theoretic measures, such as mutual information and Kullback-Leibler (KL) divergence, as objective functions for fitting such models without knowledge of the hidden layer. We study one model as a proof of concept and one application to cognitive performance, and we further examine the use of optimizers with these objectives. Mutual information is largely successful as an objective, depending on the parameters. KL divergence is found to be similarly successful, given some knowledge of the statistics of the hidden layer.
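As a rough illustration of mutual information used as a fitting objective, the following toy sketch recovers one parameter of a hidden linear stage that is observed only through an unknown monotone nonlinearity. Everything here is an assumption made for illustration and is not taken from this work: the two-input model, the names `true_a`, `hidden` and `theta_hat`, the rank-binned histogram estimator, and the grid search that stands in for an optimizer.

```python
import numpy as np

def rank_bins(v, bins):
    """Map samples to equal-frequency bin indices 0..bins-1."""
    ranks = np.argsort(np.argsort(v))
    return (ranks * bins) // len(v)

def mutual_information(u, v, bins=16):
    """Plug-in MI estimate (in nats) from a 2-D histogram of rank-binned samples."""
    iu, iv = rank_bins(u, bins), rank_bins(v, bins)
    joint = np.zeros((bins, bins))
    np.add.at(joint, (iu, iv), 1.0)      # accumulate joint counts
    joint /= joint.sum()                 # normalize to a joint probability table
    pu = joint.sum(axis=1, keepdims=True)
    pv = joint.sum(axis=0, keepdims=True)
    nz = joint > 0                       # skip empty cells to avoid log(0)
    return float(np.sum(joint[nz] * np.log(joint[nz] / (pu @ pv)[nz])))

# Hypothetical toy data: a hidden linear stage followed by a monotone nonlinearity.
rng = np.random.default_rng(0)
n = 5000
x1, x2 = rng.standard_normal(n), rng.standard_normal(n)
true_a = 2.0
hidden = true_a * x1 + x2                # never passed to the fitting step
y = hidden + hidden**3 + 0.1 * rng.standard_normal(n)

# Fit by maximizing estimated MI between the candidate hidden stage and y.
thetas = np.linspace(0.0, 4.0, 41)
scores = [mutual_information(t * x1 + x2, y) for t in thetas]
theta_hat = float(thetas[int(np.argmax(scores))])
print(theta_hat)
```

Because mutual information is unchanged by monotone transformations of either variable, the candidate hidden stage can be scored against `y` without modelling the output nonlinearity. A histogram estimator is used only for brevity; it is biased, and nearest-neighbour estimators are a common alternative.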
