Knowledge-guided Semantic Computing Network

It is very useful to integrate human knowledge and experience into traditional neural networks for faster learning speed, fewer training samples and better interpretability. However, due to the obscured and indescribable black box model of neural networks, it is very difficult to design its architecture, interpret its features and predict its performance. Inspired by human visual cognition process, we propose a knowledge-guided semantic computing network which includes two modules: a knowledge-guided semantic tree and a data-driven neural network. The semantic tree is pre-defined to describe the spatial structural relations of different semantics, which just corresponds to the tree-like description of objects based on human knowledge. The object recognition process through the semantic tree only needs simple forward computing without training. Besides, to enhance the recognition ability of the semantic tree in aspects of the diversity, randomicity and variability, we use the traditional neural network to aid the semantic tree to learn some indescribable features. Only in this case, the training process is needed. The experimental results on MNIST and GTSRB datasets show that compared with the traditional data-driven network, our proposed semantic computing network can achieve better performance with fewer training samples and lower computational complexity. Especially, Our model also has better adversarial robustness than traditional neural network with the help of human knowledge.

[1]  Kilian Q. Weinberger,et al.  Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[2]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[3]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[4]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[5]  Steven W. Zucker,et al.  Feedforward Learning of Mixture Models , 2014, NIPS.

[6]  Ke Lu,et al.  Sparse-Representation-Based Graph Embedding for Traffic Sign Recognition , 2012, IEEE Transactions on Intelligent Transportation Systems.

[7]  Joan Bruna,et al.  Intriguing properties of neural networks , 2013, ICLR.

[8]  Samy Bengio,et al.  Adversarial examples in the physical world , 2016, ICLR.

[9]  Kouichi Sakurai,et al.  One Pixel Attack for Fooling Deep Neural Networks , 2017, IEEE Transactions on Evolutionary Computation.

[10]  Quanshi Zhang,et al.  Visual interpretability for deep learning: a survey , 2018, Frontiers of Information Technology & Electronic Engineering.

[11]  Arvind Satyanarayan,et al.  The Building Blocks of Interpretability , 2018 .

[12]  D. Hubel,et al.  Segregation of form, color, movement, and depth: anatomy, physiology, and perception. , 1988, Science.

[13]  Changshui Zhang,et al.  Traffic Sign Recognition With Hinge Loss Trained Convolutional Neural Networks , 2014, IEEE Transactions on Intelligent Transportation Systems.

[14]  Ali Farhadi,et al.  You Only Look Once: Unified, Real-Time Object Detection , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[15]  Geoffrey E. Hinton,et al.  Dynamic Routing Between Capsules , 2017, NIPS.

[16]  Nina Narodytska,et al.  Simple Black-Box Adversarial Attacks on Deep Neural Networks , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[17]  J. Tenenbaum,et al.  Word learning as Bayesian inference. , 2007, Psychological review.

[18]  Nicolas Pinto,et al.  Why is Real-World Visual Object Recognition Hard? , 2008, PLoS Comput. Biol..

[19]  D. Hubel,et al.  Receptive fields, binocular interaction and functional architecture in the cat's visual cortex , 1962, The Journal of physiology.

[20]  Jonathon Shlens,et al.  Explaining and Harnessing Adversarial Examples , 2014, ICLR.

[21]  Ananthram Swami,et al.  Practical Black-Box Attacks against Machine Learning , 2016, AsiaCCS.

[22]  Leo Breiman,et al.  Random Forests , 2001, Machine Learning.

[23]  Pascal Frossard,et al.  Analysis of classifiers’ robustness to adversarial perturbations , 2015, Machine Learning.

[24]  Tao Xiang,et al.  Learning to Compare: Relation Network for Few-Shot Learning , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[25]  Richard S. Zemel,et al.  Prototypical Networks for Few-shot Learning , 2017, NIPS.

[26]  Geoffrey E. Hinton,et al.  Matrix capsules with EM routing , 2018, ICLR.

[27]  Jason Jianjun Gu,et al.  An Efficient Method for Traffic Sign Recognition Based on Extreme Learning Machine , 2017, IEEE Transactions on Cybernetics.

[28]  Bhaskara Marthi,et al.  A generative vision model that trains with high data efficiency and breaks text-based CAPTCHAs , 2017, Science.

[29]  Pietro Perona,et al.  One-shot learning of object categories , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[30]  Samy Bengio,et al.  Show and tell: A neural image caption generator , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[31]  Leslie G. Ungerleider,et al.  ‘What’ and ‘where’ in the human brain , 1994, Current Opinion in Neurobiology.

[32]  Blaine Nelson,et al.  The security of machine learning , 2010, Machine Learning.

[33]  Song-Chun Zhu,et al.  Learning AND-OR Templates for Object Recognition and Detection , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[34]  T. Poggio,et al.  Hierarchical models of object recognition in cortex , 1999, Nature Neuroscience.

[35]  Forrest N. Iandola,et al.  SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size , 2016, ArXiv.

[36]  Quanshi Zhang,et al.  Interpretable Convolutional Neural Networks , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[37]  Geoffrey E. Hinton,et al.  Experiments on Learning by Back Propagation. , 1986 .

[38]  J. Franklin,et al.  The elements of statistical learning: data mining, inference and prediction , 2005 .

[39]  Song Han,et al.  Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding , 2015, ICLR.

[40]  Xiaoli Hao,et al.  Robust Traffic Sign Recognition Based on Color Global and Local Oriented Edge Magnitude Patterns , 2014, IEEE Transactions on Intelligent Transportation Systems.

[41]  Yoshua Bengio,et al.  Gradient-based learning applied to document recognition , 1998, Proc. IEEE.

[42]  John D. Austin,et al.  Adaptive histogram equalization and its variations , 1987 .

[43]  Lei Wang,et al.  HEp-2 Cell Image Classification With Deep Convolutional Neural Networks , 2015, IEEE Journal of Biomedical and Health Informatics.

[44]  Eve V. Clark,et al.  The principle of contrast: A constraint on language acquisition. , 1987 .

[45]  Oriol Vinyals,et al.  Matching Networks for One Shot Learning , 2016, NIPS.

[46]  Leslie G. Ungerleider,et al.  Distributed representation of objects in the human ventral visual pathway. , 1999, Proceedings of the National Academy of Sciences of the United States of America.

[47]  Blaine Nelson,et al.  Can machine learning be secure? , 2006, ASIACCS '06.