Artificial Neural Networks and Machine Learning – ICANN 2014

This paper introduces a novel recurrent neural network (RNN) model for gradient-based sequence learning. The presented dynamic cortex memory (DCM) is an extension of the well-known long short-term memory (LSTM) model. The main innovation of the DCM is that it strengthens the interplay between the gates and the error carousel through several new trainable connections. These connections enable direct signal transfer from one gate to another. With this enhancement, the networks converge faster than LSTMs when trained with back-propagation through time (BPTT) under the same training conditions. Furthermore, DCMs yield better generalization results than LSTMs. This behaviour is shown for several supervised problem scenarios, including storing precise values, adding, and learning a context-sensitive grammar.
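The idea of direct gate-to-gate connections can be illustrated with a minimal sketch. It is not the paper's formulation: the abstract does not give the DCM equations, so the function name `dcm_like_step`, the 3x3 `gate_to_gate` weight table, and the choice to feed each gate with the previous time step's gate activations are all assumptions made for illustration. The sketch uses a single scalar cell and omits the usual recurrent connections from the block output to keep it short.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dcm_like_step(x, state, w, gate_to_gate):
    """One time step of a single-cell LSTM-style block with extra gate links.

    Hypothetical illustration only, not the published DCM equations.
    x            : scalar input at this time step
    state        : {"c": cell state, "gates": [input, forget, output] from t-1}
    w            : {"i", "f", "o", "z"} -> (input weight, bias)
    gate_to_gate : 3x3 table; gate_to_gate[dst][src] weights the previous
                   activation of gate `src` into the net input of gate `dst`
    """
    prev_gates = state["gates"]
    gates = []
    for dst, name in enumerate(("i", "f", "o")):
        w_x, bias = w[name]
        # Standard gate input plus the assumed direct gate-to-gate signal.
        cortex = sum(gate_to_gate[dst][src] * prev_gates[src] for src in range(3))
        gates.append(sigmoid(w_x * x + bias + cortex))
    i, f, o = gates
    w_z, bias_z = w["z"]
    z = math.tanh(w_z * x + bias_z)   # candidate cell input
    c = f * state["c"] + i * z        # error carousel update
    h = o * math.tanh(c)              # block output
    return h, {"c": c, "gates": gates}


def run(inputs, w, gate_to_gate):
    """Run the block over a sequence and return the list of outputs."""
    state = {"c": 0.0, "gates": [0.0, 0.0, 0.0]}
    outputs = []
    for x in inputs:
        h, state = dcm_like_step(x, state, w, gate_to_gate)
        outputs.append(h)
    return outputs
```

Setting every entry of `gate_to_gate` to zero reduces the block to a plain (simplified) LSTM-style cell, which makes the effect of the extra connections easy to isolate: the two variants agree on the first step, when no previous gate activations exist, and diverge afterwards.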