Deep Learning for Math Knowledge Processing

The vast and fast-growing STEM literature makes it imperative to develop systems for automated extraction of mathematical semantics from technical content, and for semantically enabled processing of that content. Grammar-based techniques alone are inadequate for this task. We present a new project that applies deep learning (DL) to this purpose. It will explore a number of DL and representation-learning models, which have shown superior performance in applications involving sequences of data. Because mathematics and science involve sequences of text, symbols, and equations, such deep learning models are expected to deliver good performance in math-semantics extraction and processing.
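To make the "sequences of text, symbols and equations" framing concrete, here is a minimal illustrative sketch (not the project's actual pipeline): a LaTeX formula is tokenized into a symbol sequence, and windowed co-occurrence counts are collected over such sequences, the raw statistic that count-based embedding models build on. The tokenization pattern and function names are assumptions for illustration only.

```python
import re
from collections import defaultdict

def tokenize_latex(formula: str) -> list[str]:
    """Split a LaTeX formula into a token sequence:
    commands (\\frac, \\alpha), single identifiers, digit runs, operators."""
    pattern = r"\\[A-Za-z]+|[A-Za-z]|\d+|[^\sA-Za-z\d]"
    return re.findall(pattern, formula)

def cooccurrence(sequences, window=2):
    """Count symmetric co-occurrences within a fixed window over
    token sequences -- the input to count-based embedding methods."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for i, tok in enumerate(seq):
            lo, hi = max(0, i - window), min(len(seq), i + window + 1)
            for j in range(lo, hi):
                if i != j:
                    counts[tok][seq[j]] += 1
    return counts

toks = tokenize_latex(r"\frac{a}{b} + \alpha x^2")
# toks == ['\\frac', '{', 'a', '}', '{', 'b', '}', '+', '\\alpha', 'x', '^', '2']
stats = cooccurrence([toks], window=2)
```

Treating formulae as token sequences in this way is what lets sequence models (RNNs, encoder-decoder architectures) and embedding methods developed for natural language be applied to mathematical content.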
