Application of different entropy formalisms in a neural network for novel word learning

Abstract. In this paper, novel word learning in adults is studied. To this end, four entropy formalisms (Tsallis, Landsberg-Vedral, Kaniadakis, and Abe) are employed to introduce a degree of non-locality into a neural network. First, non-extensive cost functions are obtained analytically for all of the entropies. Then, a generalization of the gradient-descent dynamics is used as a learning rule in a simple perceptron. The Langevin equations are solved numerically, and the error function (learning curve) is obtained as a function of time for different parameter values. The influence of the entropic index q and the number of neurons N on learning is investigated for all of the entropies. It is found that the learning curve (error) is a decreasing function of time for all of the entropies. The rate of learning for the Landsberg-Vedral entropy is slower than for the other entropies, and its learning curve changes only slightly as the number of neurons increases. These results suggest that entropy formalisms can serve as a tool for studying learning.
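The procedure described above (a non-extensive cost, a generalized gradient-descent rule, and Langevin noise) can be illustrated with a minimal sketch. This is not the paper's exact formalism: the quadratic teacher-student error, the specific q-deformation `E_q = ln_q(1 + E)` built from the Tsallis q-logarithm, and all parameter values here are assumptions chosen only to show the structure of the dynamics.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 50        # number of input neurons
P = 100       # number of training patterns
q = 1.5       # entropic index (q -> 1 recovers the ordinary cost)
eta = 0.05    # learning rate
T = 1e-4      # temperature of the Langevin noise
steps = 2000

# A teacher perceptron defines the rule to be learned (assumed setup).
teacher = rng.normal(size=N) / np.sqrt(N)
X = rng.normal(size=(P, N))
y = X @ teacher

def error(J):
    """Ordinary quadratic training error of the student weights J."""
    r = X @ J - y
    return 0.5 * np.mean(r**2)

def grad_error(J):
    r = X @ J - y
    return (X.T @ r) / P

def grad_cost_q(J):
    """Gradient of the assumed q-deformed cost E_q = ln_q(1 + E), where
    ln_q(x) = (x**(1-q) - 1) / (1 - q), so dE_q/dE = (1 + E)**(-q)."""
    return (1.0 + error(J)) ** (-q) * grad_error(J)

# Discretized Langevin dynamics: drift down the deformed cost plus noise.
J = rng.normal(size=N) / np.sqrt(N)
curve = []
for t in range(steps):
    noise = rng.normal(size=N)
    J = J - eta * grad_cost_q(J) + np.sqrt(2 * T * eta) * noise
    curve.append(error(J))

print(f"initial error: {curve[0]:.4f}, final error: {curve[-1]:.4f}")
```

The list `curve` plays the role of the learning curve discussed in the abstract; under this deformation the effective learning rate `(1 + E)**(-q)` is smallest when the error is large, so the entropic index q directly modulates the speed of learning.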
