The induction of dynamical recognizers

A higher order recurrent neural network architecture learns to recognize and generate languages after being “trained” on categorized exemplars. Studying these networks from the perspective of dynamical systems yields two interesting discoveries. First, a longitudinal examination of the learning process illustrates a new form of mechanical inference: induction by phase transition. A small weight adjustment causes a “bifurcation” in the limit behavior of the network, and this phase transition corresponds to the onset of the network’s capacity for generalizing to arbitrary-length strings. Second, a study of the automata resulting from the acquisition of previously published training sets indicates that, although the architecture is not guaranteed to find a minimal finite automaton consistent with the given exemplars (an NP-hard problem), it does appear capable of generating non-regular languages by exploiting fractal and chaotic dynamics. I end the paper with a hypothesis relating linguistic generative capacity to the behavioral regimes of non-linear dynamical systems.
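The recognizer described above can be viewed as an iterated map: each input symbol selects a weight matrix that updates the network's state, and acceptance is read off a designated state unit after the last symbol. The following is a minimal sketch of that kind of second-order ("higher order") recurrent update, not the paper's trained system: the weights are random and untrained, and all names, dimensions, and the two-symbol alphabet are illustrative assumptions.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def step(W, z, symbol):
    """One second-order update: the current input symbol indexes a
    weight matrix, so the next state depends multiplicatively on
    state and input:  z_i(t+1) = g( sum_j W[symbol][i][j] * z_j(t) )."""
    n = len(z)
    return [sigmoid(sum(W[symbol][i][j] * z[j] for j in range(n)))
            for i in range(n)]

def recognize(W, z0, string):
    """Iterate the map over the symbol string; accept the string if a
    designated state unit (here z[0]) ends above threshold."""
    z = list(z0)
    for symbol in string:
        z = step(W, z, symbol)
    return z[0] > 0.5

# Illustrative run: small random (untrained) weights, two-symbol alphabet.
random.seed(0)
n = 3
W = [[[random.uniform(-2.0, 2.0) for _ in range(n)] for _ in range(n)]
     for _ in range(2)]
z0 = [0.5] * n
print(recognize(W, z0, [0, 1, 1, 0]))
```

Because each weight matrix is a smooth map of the unit hypercube into itself, training can shift the iterated system between behavioral regimes (fixed points, limit cycles, chaos), which is the bifurcation the abstract refers to.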
