Hierarchical Variance Models of Image Sequences

Number of pages: 82

Abstract: Unsupervised learning applied to image sequences ordinarily yields simple features such as edge detectors. These simple features cannot, by themselves, provide high-level information about an image sequence; something more meaningful can be extracted from the data only by combining the information they provide. The values statistical models usually predict are the means of the underlying probability distributions, and all higher-order statistics are ignored. The variance describes the deviation of a probability distribution from its mean. Estimating variances jointly with means is difficult, and consequently little emphasis is usually placed on it. However, it is well known that in many datasets the variance conveys information that cannot be extracted by modelling means alone. The basic question of this thesis is whether modelling the variances in image sequences is useful and whether something can be gained by doing so. It is shown that this is indeed the case, and a specific hierarchical model utilizing variances is constructed step by step. The learning algorithm, including the local update rules and global initialization schemes, is also derived and presented. The basic approach taken is variational Bayesian learning, which has proven to be a robust method even in rather hard problems. The model is put to the test with simulations on artificial data, which show that the learning algorithm works. Simulations with an image sequence from a natural scene show that the algorithm also works with realistic data.

Acknowledgements: This work has been funded by the Finnish Center of Excellence Programme under the project New Information Processing Principles. I would like to express my gratitude to my instructor Tapani Raiko and to my supervisor, Professor Juha Karhunen. Antti Honkela's technical advice has also been most helpful. Additionally, I would like to thank Doctors Jarmo Hurri and Hans van Hateren, the former for providing advice and code for accessing the video database of the latter [56]. Finally, I wish to acknowledge Dr. Harri Valpola, whose pioneering work on applying variational Bayesian learning to non-linear and non-Gaussian models has been the essential basis of this thesis.
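To make the central claim concrete, here is a minimal NumPy sketch (an illustration written for this summary, not code from the thesis): a zero-mean signal is generated whose only structure lies in a slowly varying variance. A mean-only summary of it is uninformative, while a crude variance estimate, a smoothed version of log x_t^2, recovers the hidden variance source. The random-walk source u, the window length, and all other constants are arbitrary illustrative choices.

```python
# Toy demonstration (not from the thesis): information carried by
# variance alone, in the spirit of variance-source models.
import numpy as np

rng = np.random.default_rng(0)
T = 10_000

# Slowly varying log-variance ("variance source"): u_t drives sigma_t.
u = np.cumsum(rng.normal(0.0, 0.05, T))   # random walk (arbitrary choice)
u -= u.mean()                             # centre for numerical convenience
sigma = np.exp(0.5 * u)                   # standard deviation exp(u_t / 2)

# Observations: zero mean, variance exp(u_t). A mean-only model sees
# essentially nothing, since the sample mean stays near 0 regardless of u.
x = sigma * rng.normal(size=T)
print("sample mean:", x.mean())           # near 0

# Modelling the variance recovers the hidden source: a windowed average
# of log x_t^2 tracks u_t up to a constant offset.
window = 201                              # arbitrary smoothing length
kernel = np.ones(window) / window
u_hat = np.convolve(np.log(x**2 + 1e-12), kernel, mode="same")
u_hat -= u_hat.mean()                     # remove the constant offset

corr = np.corrcoef(u, u_hat)[0, 1]
print(f"correlation between u and its estimate: {corr:.2f}")  # close to 1
```

The thesis itself estimates such variance sources within a hierarchical model learned by variational Bayes rather than by this ad hoc smoothing; the sketch only shows why modelling variances can pay off.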

[1] W. K. Hastings et al. Monte Carlo Sampling Methods Using Markov Chains and Their Applications, 1970.

[2] D. Ruderman et al. Independent component analysis of natural image sequences yields spatio-temporal filters similar to simple cells in primary visual cortex, 1998, Proceedings of the Royal Society of London, Series B: Biological Sciences.

[3] R. Feynman. Statistical Mechanics, A Set of Lectures, 1972.

[4] J. Karhunen et al. Building Blocks for Hierarchical Latent Variable Models, 2001.

[5] Aapo Hyvärinen et al. Simple-Cell-Like Receptive Fields Maximize Temporal Coherence in Natural Video, 2003, Neural Computation.

[6] H. Attias. Independent Component Analysis: ICA, graphical models and variational methods, 2001.

[7] L. Goddard. Information Theory, 1962, Nature.

[8] Antti Honkela et al. Bayes Blocks Software Library, 2003.

[9] Richard M. Everson et al. ICA: model order selection and dynamic source models, 2001.

[10] David J. C. MacKay. Information Theory, Inference, and Learning Algorithms, 2004, IEEE Transactions on Information Theory.

[11] Hagai Attias. Independent Factor Analysis, 1999, Neural Computation.

[12] A. Honkela. Speeding up cyclic update schemes by pattern searches, 2002, Proceedings of the 9th International Conference on Neural Information Processing (ICONIP '02).

[13] Aapo Hyvärinen et al. Bubbles: a unifying framework for low-level statistical properties of natural image sequences, 2003, Journal of the Optical Society of America A: Optics, Image Science, and Vision.

[14] Simon Haykin. Neural Networks: A Comprehensive Foundation, 1998.

[15] Antti Honkela et al. Variational learning and bits-back coding: an information-theoretic view to Bayesian learning, 2004, IEEE Transactions on Neural Networks.

[16] Alexander Ilin et al. On the Effect of the Form of the Posterior Approximation in Variational Learning of ICA Models, 2005, Neural Processing Letters.

[17] Terrence J. Sejnowski et al. Slow Feature Analysis: Unsupervised Learning of Invariances, 2002, Neural Computation.

[18] R. T. Cox. Probability, frequency and reasonable expectation, 1990.

[19] Mark D. Plumbley. Blind Separation of Positive Sources Using Non-negative PCA, 2003.

[20] Aapo Hyvärinen et al. Topographic Independent Component Analysis, 2001, Neural Computation.

[21] T. Bollerslev. Generalized autoregressive conditional heteroskedasticity, 1986.

[22] N. Metropolis et al. Equation of State Calculations by Fast Computing Machines, 1953, Resonance.

[23] Geoffrey E. Hinton et al. Variational Learning for Switching State-Space Models, 2000, Neural Computation.

[24] Laurenz Wiskott et al. Applying Slow Feature Analysis to Image Sequences Yields a Rich Repertoire of Complex Cell Properties, 2002, ICANN.

[25] T. Raiko. Partially observed values, 2004, IEEE International Joint Conference on Neural Networks.

[26] Harri Lappalainen. Ensemble learning for independent component analysis, 1999.

[27] Antti Honkela et al. On-line Variational Bayesian Learning, 2003.

[28] David B. Dunson et al. Bayesian Data Analysis, 2010.

[29] Terrence J. Sejnowski et al. The "independent components" of natural scenes are edge filters, 1997, Vision Research.

[30] David J. Field et al. Emergence of simple-cell receptive field properties by learning a sparse code for natural images, 1996, Nature.

[31] Geoffrey E. Hinton et al. Keeping the neural networks simple by minimizing the description length of the weights, 1993, COLT '93.

[32] Juha Karhunen et al. Accelerating Cyclic Update Algorithms for Parameter Estimation by Pattern Searches, 2003, Neural Processing Letters.

[33] D. Hubel et al. Receptive fields, binocular interaction and functional architecture in the cat's visual cortex, 1962, The Journal of Physiology.

[34] Juha Karhunen et al. An Unsupervised Ensemble Learning Method for Nonlinear Dynamic State-Space Models, 2002, Neural Computation.

[35] N. Shephard et al. Stochastic Volatility: Likelihood Inference and Comparison with ARCH Models, 1996.

[36] Antti Honkela et al. Bayesian Non-Linear Independent Component Analysis by Multi-Layer Perceptrons, 2000.

[37] R. T. Cox. The Algebra of Probable Inference, 1962.

[38] P. Lennie. Receptive fields, 2003, Current Biology.

[39] Aapo Hyvärinen et al. Emergence of Phase- and Shift-Invariant Features by Decomposition of Natural Images into Independent Feature Subspaces, 2000, Neural Computation.

[40] J. W. Miskin. Ensemble Learning for Blind Source Separation, 2001.

[41] David J. C. MacKay. Developments in Probabilistic Modelling with Neural Networks: Ensemble Learning, 1995, SNN Symposium on Neural Networks.

[42] Lucas C. Parra et al. Higher-Order Statistical Properties Arising from the Non-Stationarity of Natural Signals, 2000, NIPS.

[43] J. Karhunen et al. Nonlinear Independent Factor Analysis by Hierarchical Models, 2003.

[44] R. Engle. Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation, 1982.

[45] R. Baierlein. Probability Theory: The Logic of Science, 2004.

[46] Juha Karhunen et al. Hierarchical models of variance sources, 2004, Signal Processing.

[47] D. Luenberger. Optimization by Vector Space Methods, 1968.

[48] Erkki Oja et al. Independent Component Analysis, 2001.

[49] David Barber et al. Ensemble Learning for Multi-Layer Networks, 1997, NIPS.

[50] Dennis Gabor. Theory of communication, 1946.

[51] Juha Karhunen et al. Missing Values in Hierarchical Nonlinear Factor Analysis, 2003.

[52] D. MacKay. Local Minima, Symmetry-breaking, and Model Pruning in Variational Free Energy Minimization, 2001.