Fusing bottom-up and top-down pathways in neural networks for visual object recognition

In this paper, we build an artificial neural network model with two pathways, a bottom-up sensory-driven pathway and a top-down expectation-driven pathway, which are fused to train the network for visual object recognition. During supervised learning, the bottom-up pathway generates hypotheses as network outputs, and the target label is then applied to update the bottom-up connections. In turn, the hypotheses generated by the bottom-up pathway produce expectations about the sensory input through the top-down pathway. These expectations are constrained by the real data from the sensory input, which is used to update the top-down connections accordingly. The two-pathway network can also be applied to semi-supervised learning with both labeled and unlabeled data, since the network can generate hypotheses and corresponding expectations without a label. Experiments on visual object recognition suggest that the proposed model shows promise for recovering objects when data are missing from the sensory input.
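The training loop described above can be illustrated with a minimal sketch. The abstract does not specify the architecture, activation functions, losses, or update rules, so everything below is an assumption made for illustration: a single linear layer per pathway, a softmax hypothesis trained with cross-entropy, and a linear expectation trained on squared reconstruction error. The names `TwoPathwayNet`, `hypothesize`, `expect`, `step`, and `demo` are hypothetical and do not come from the paper.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D vector."""
    e = np.exp(z - z.max())
    return e / e.sum()


class TwoPathwayNet:
    """Illustrative single-layer sketch; NOT the authors' architecture."""

    def __init__(self, n_in, n_out, lr=0.1, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(0.0, 0.01, (n_out, n_in))  # bottom-up weights
        self.V = rng.normal(0.0, 0.01, (n_in, n_out))  # top-down weights
        self.lr = lr

    def hypothesize(self, x):
        """Bottom-up pathway: sensory input -> hypothesis over classes."""
        return softmax(self.W @ x)

    def expect(self, y):
        """Top-down pathway: hypothesis -> expected sensory input."""
        return self.V @ y

    def step(self, x, target=None):
        """One update. With target=None the sample is treated as unlabeled."""
        y = self.hypothesize(x)
        if target is not None:
            # Assumed rule: cross-entropy gradient for a softmax output.
            self.W -= self.lr * np.outer(y - target, x)
        # The expectation is compared with the real input; the difference
        # drives the top-down update (assumed: squared-error gradient).
        x_hat = self.expect(y)
        self.V -= self.lr * np.outer(x_hat - x, y)
        return y, x_hat


def demo():
    """Train on two toy patterns, then present one with missing entries."""
    patterns = np.array([[1, 1, 1, 1, 0, 0, 0, 0],
                         [0, 0, 0, 0, 1, 1, 1, 1]], dtype=float)
    targets = np.eye(2)
    net = TwoPathwayNet(n_in=8, n_out=2)
    for _ in range(200):
        for x, t in zip(patterns, targets):
            net.step(x, t)
    masked = patterns[0].copy()
    masked[2:4] = 0.0  # simulate missing sensory data
    y = net.hypothesize(masked)
    return patterns[0], masked, y, net.expect(y)


if __name__ == "__main__":
    original, masked, y, x_hat = demo()
    print("hypothesis:", np.round(y, 3))
    print("expectation:", np.round(x_hat, 2))
```

In this toy setting, the expectation produced from the masked input lies closer to the complete pattern than the masked input itself, which is the kind of recovery the abstract refers to. Calling `step(x)` without a target leaves the bottom-up weights unchanged and updates only the top-down weights, which is one possible reading of how unlabeled data could be used.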
