Language Modeling with Reduced Densities

We present a framework for modeling words, phrases, and longer expressions in a natural language using reduced density operators. We show that these operators capture something of the meaning of these expressions and that, under the Loewner order on positive semidefinite operators, the passage from expressions to operators preserves both a simple form of entailment and the relevant statistics therein. Pulling back the curtain, the assignment is shown to be a functor between categories enriched over probabilities.
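The entailment claim can be illustrated concretely. A minimal sketch, under simplifying assumptions not taken from the paper: we take each reduced density operator to be diagonal in a basis of one-word continuations, with entries given by unnormalized corpus statistics, and we check the Loewner order ρ(x) ⊑ ρ(y) by testing whether ρ(y) − ρ(x) is positive semidefinite. Since every occurrence of a longer expression like "small blue dog" is also an occurrence of "blue dog", the longer, more specific expression sits below the shorter one.

```python
import numpy as np
from collections import Counter

# Toy corpus: every occurrence of "small blue dog" is also an
# occurrence of "blue dog", so the former should sit below the
# latter in the Loewner order.
corpus = [
    "the small blue dog runs",
    "the blue dog sleeps",
    "a small blue dog runs",
    "the blue dog runs",
]

tokens = [s.split() for s in corpus]
vocab = sorted({w for s in tokens for w in s})
total = sum(len(s) for s in tokens)

def continuation_counts(expr):
    """Count one-word continuations of expr across the corpus."""
    expr = expr.split()
    k = len(expr)
    c = Counter()
    for sent in tokens:
        for i in range(len(sent) - k):
            if sent[i:i + k] == expr:
                c[sent[i + k]] += 1
    return c

def reduced_density(expr):
    """Simplified (diagonal) reduced density operator for expr:
    entries are unnormalized continuation statistics pi(expr s)."""
    c = continuation_counts(expr)
    return np.diag([c[w] / total for w in vocab])

def loewner_leq(A, B, tol=1e-12):
    """A <= B in the Loewner order iff B - A is positive semidefinite."""
    return bool(np.all(np.linalg.eigvalsh(B - A) >= -tol))

rho_long = reduced_density("small blue dog")
rho_short = reduced_density("blue dog")
print(loewner_leq(rho_long, rho_short))   # True: longer expression entails shorter
print(loewner_leq(rho_short, rho_long))   # False: entailment is not symmetric
```

The diagonal form is a deliberate simplification so that the order is easy to verify by eye (it reduces to entrywise comparison); the eigenvalue test shown is the general-purpose check for arbitrary positive semidefinite operators.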
