Learning from Few Samples with Memory Network

Neural networks (NNs) have achieved great success in pattern recognition and machine learning. However, this success usually relies on the provision of a sufficiently large number of training samples; when fed with a limited data set, an NN's performance may degrade significantly. In this paper, a novel NN structure called the memory network is proposed. It is inspired by the cognitive mechanism of human beings, who can learn effectively even from limited data. By taking advantage of the memory of previous samples, the new model achieves a remarkable improvement in performance when trained with limited data. The memory network is demonstrated here using the multi-layer perceptron (MLP) as a base model, but the idea extends straightforwardly to other neural networks, e.g., convolutional neural networks (CNNs). In this paper, the memory network structure is detailed, the training algorithm is presented, and a series of experiments is conducted to validate the proposed framework. Experimental results show that the proposed model outperforms traditional MLP-based models as well as other competitive algorithms on two real benchmark data sets.
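
The sketch below illustrates, in broad strokes, the kind of memory-augmented MLP the abstract describes: an MLP whose prediction is combined with a read from a memory of previously seen samples. It is a minimal illustration only; the class name MemoryMLP, the prototype-style memory (a running-average hidden vector per class), and the mixing weight alpha are assumptions for illustration, not the authors' actual mechanism or code.

```python
# Hypothetical sketch of a memory-augmented MLP (not the paper's implementation).
# Assumption: "memory" is modeled as one running-average hidden vector per class,
# written from previously seen training samples and read via cosine similarity.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MemoryMLP(nn.Module):
    def __init__(self, in_dim, hidden_dim, num_classes, alpha=0.5):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        self.classifier = nn.Linear(hidden_dim, num_classes)
        # Memory: one prototype (mean hidden feature) and a sample count per class.
        self.register_buffer("prototypes", torch.zeros(num_classes, hidden_dim))
        self.register_buffer("counts", torch.zeros(num_classes))
        self.alpha = alpha  # hypothetical weight between MLP logits and memory read

    @torch.no_grad()
    def write_memory(self, x, y):
        """Update the per-class running average of hidden features from past samples."""
        h = self.encoder(x)
        for c in y.unique():
            mask = y == c
            n_new = mask.sum()
            n_old = self.counts[c]
            self.prototypes[c] = (self.prototypes[c] * n_old + h[mask].sum(0)) / (n_old + n_new)
            self.counts[c] = n_old + n_new

    def forward(self, x):
        h = self.encoder(x)
        mlp_logits = self.classifier(h)
        # Memory read: similarity of the current hidden feature to stored prototypes.
        mem_logits = F.cosine_similarity(
            h.unsqueeze(1), self.prototypes.unsqueeze(0), dim=-1
        )
        return self.alpha * mlp_logits + (1 - self.alpha) * mem_logits
```

In such a design, the memory read supplies a nearest-prototype signal that can stabilize predictions when only a handful of training samples are available, while the MLP branch behaves as a conventional classifier; the balance between the two is what alpha (a hypothetical parameter here) would control.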
