Brain-inspired Robust Vision using Convolutional Neural Networks with Feedback

Primates have a remarkable ability to correctly classify images even in the presence of significant noise and degradation. In contrast, even state-of-the-art CNNs are extremely vulnerable to imperceptible levels of noise. Many neuroscience studies have suggested that robustness in human vision arises from the interaction between feedforward signals from the bottom-up pathways of the visual cortex and feedback signals from the top-down pathways. Motivated by this, we propose a new neuro-inspired model, Convolutional Neural Networks with Feedback (CNN-F). CNN-F augments a CNN with a generative feedback network that shares the same set of weights along with an additional set of latent variables. CNN-F combines bottom-up and top-down inference through approximate loopy belief propagation to obtain MAP estimates of the latent variables. We show that CNN-F's iterative inference allows for disentanglement of latent variables across layers. We validate the advantages of CNN-F over the baseline CNN in multiple ways. Our experimental results suggest that CNN-F is more robust to image degradation such as pixel noise, occlusion, and blur than the corresponding CNN. Furthermore, we show that CNN-F is capable of restoring original images from degraded ones with high reconstruction accuracy while introducing negligible artifacts.
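
To make the iterative bottom-up/top-down inference concrete, below is a minimal PyTorch sketch of the alternating feedforward and feedback passes. It is not the authors' implementation: the class name `CNNFSketch`, the single-convolution architecture on 28x28 single-channel inputs, the fixed number of iterations, and the use of a transposed convolution with the shared convolution weights as the top-down generative path are illustrative assumptions; the actual CNN-F additionally maintains latent variables for the nonlinearities and updates them via approximate loopy belief propagation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class CNNFSketch(nn.Module):
    """Toy CNN with a weight-sharing feedback (generative) pass."""

    def __init__(self, num_classes: int = 10):
        super().__init__()
        # Bottom-up (feedforward) path: one conv layer plus a classifier head.
        self.conv = nn.Conv2d(1, 16, kernel_size=3, padding=1)
        self.fc = nn.Linear(16 * 28 * 28, num_classes)

    def feedforward(self, x):
        # Encode the current input estimate into features and class logits.
        h = F.relu(self.conv(x))
        logits = self.fc(h.flatten(1))
        return h, logits

    def feedback(self, h):
        # Top-down (generative) path: reuse the same convolution weights via a
        # transposed convolution, mirroring the weight sharing described above.
        return F.conv_transpose2d(h, self.conv.weight, padding=1)

    def forward(self, x, num_iters: int = 3):
        # Alternate bottom-up and top-down passes, refining the input estimate.
        # This loop stands in for the approximate loopy belief propagation
        # updates used to obtain MAP estimates of the latent variables.
        x_hat = x
        logits = None
        for _ in range(num_iters):
            h, logits = self.feedforward(x_hat)
            x_hat = self.feedback(h)
        return logits, x_hat


# Example usage: classify a batch of noisy inputs and recover a reconstruction.
logits, x_hat = CNNFSketch()(torch.randn(8, 1, 28, 28))
```

In this sketch, `x_hat` plays the role of the restored image and `logits` the classification output, so robustness and reconstruction quality can both be read off the final iteration.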
