Formal Languages, Deep Learning, Topology and Algebraic Word Problems

This paper describes relationships between state-of-the-art neural network architectures and formal languages as structured, for example, by the Chomsky Language Hierarchy. Of particular interest is the ability of a neural architecture to represent, recognize, and generate words from a specific language by learning from positive and negative samples of words in that language. We also examine relationships between languages, networks, and topology, which we outline analytically and explore through several illustrative experiments. Comparing analytic results that relate formal languages to topology via algebraic word problems with empirical results based on neural networks and persistent homology calculations, we see evidence that certain observed topological properties match the analytically predicted ones. Such results are encouraging for understanding the role that modern machine learning can play in formal language processing problems.
