Unsupervised Classification Learning from Cross-Modal Environmental Structure

This dissertation addresses the problem of unsupervised learning for pattern classification, or category learning. A model that is based on gross cortical anatomy and implements biologically plausible computations is developed and shown to have classification power approaching that of a supervised discriminant algorithm.

The advantage of supervised learning is that the final error metric is available during training. Unfortunately, when modeling human category learning, or when constructing classifiers for autonomous robots, one cannot rely on an omniscient entity labeling all incoming sensory patterns. We show that the labels can be replaced by the structure between the pattern distributions presented to different sensory modalities. For example, the co-occurrence of a visual image of a cow with a "moo" sound can be used to simultaneously develop appropriate visual features for distinguishing the cow image and appropriate auditory features for recognizing the moo.

We model human category learning as a process of minimizing the disagreement between the outputs of sensory modalities processing temporally coincident patterns. We relate this mathematically to the optimal goal of minimizing the number of misclassifications in each modality, and apply the idea to derive an algorithm for piecewise linear classifiers in which each network uses the output of the other networks as a supervisory signal.

Using the Peterson-Barney vowel dataset, we show that the algorithm finds appropriate placement for the classification boundaries. The algorithm is then demonstrated on the task of learning to recognize acoustic and visual speech from images of lips and their emanating sounds. Performance on these tasks is within 1-7% of the related supervised algorithm (LVQ2.1).

Finally, we compare the algorithm to Becker's IMAX algorithm and suggest how the algorithm may be implemented in the brain, using physiological results concerning the relationship between two types of neural plasticity, LTP and LTD, observed in visual cortical cells. We also show how the algorithm can be used as an efficient method for learning from data with missing values.
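
The idea of each modality's classifier using the other's output as a supervisory signal can be illustrated with a toy sketch. The code below is not the dissertation's algorithm: it assumes one-dimensional patterns, two categories, nearest-prototype classifiers, and a simple LVQ1-style update in place of the rule related to LVQ2.1. All function names, data parameters, and initial prototype positions are hypothetical, and the sketch does nothing to prevent the degenerate solution in which both classifiers assign every pattern to the same label.

```python
import random


def nearest(protos, x):
    """Index of the prototype closest to the scalar pattern x."""
    return min(range(len(protos)), key=lambda i: abs(protos[i][0] - x))


def classify(protos, x):
    """Label assigned to x by a nearest-prototype classifier."""
    return protos[nearest(protos, x)][1]


def disagreement(protos_a, protos_b, pairs):
    """Fraction of co-occurring pairs on which the two modalities disagree."""
    mismatches = sum(
        classify(protos_a, a) != classify(protos_b, b) for a, b in pairs
    )
    return mismatches / len(pairs)


def teach(protos, x, target, rate):
    """LVQ1-style step (an assumption, not LVQ2.1): pull the winning
    prototype toward x if its label matches target, else push it away."""
    i = nearest(protos, x)
    pos, label = protos[i]
    sign = 1.0 if label == target else -1.0
    protos[i] = (pos + sign * rate * (x - pos), label)


def train(protos_a, protos_b, pairs, rate=0.05, epochs=20):
    """Each classifier is taught with the label output by the other one."""
    for _ in range(epochs):
        for a, b in pairs:
            label_a = classify(protos_a, a)
            label_b = classify(protos_b, b)
            teach(protos_a, a, label_b, rate)
            teach(protos_b, b, label_a, rate)


def make_pairs(n, seed=0):
    """Hypothetical data: a hidden category generates one pattern per modality.
    The category itself is never shown to either classifier."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        c = rng.randrange(2)
        a = rng.gauss(-1.0 if c == 0 else 1.0, 0.3)  # e.g. "visual" pattern
        b = rng.gauss(-2.0 if c == 0 else 2.0, 0.5)  # e.g. "auditory" pattern
        pairs.append((a, b))
    return pairs


if __name__ == "__main__":
    pairs = make_pairs(200)
    # Deliberately misplaced initial prototypes: (position, label)
    protos_a = [(0.2, 0), (0.6, 1)]
    protos_b = [(-0.5, 0), (0.1, 1)]
    print("disagreement before:", disagreement(protos_a, protos_b, pairs))
    train(protos_a, protos_b, pairs)
    print("disagreement after: ", disagreement(protos_a, protos_b, pairs))
```

The quantity reported by `disagreement` needs no category labels, which is why it can serve as a training signal here; whether reducing it also reduces misclassification in each modality is the relationship the dissertation analyzes mathematically.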
