Towards Formula Concept Discovery and Recognition

Citation-based Information Retrieval (IR) methods for scientific documents have proven effective in academic disciplines that use many references. In science, technology, engineering, and mathematics (STEM), researchers cite less often but employ mathematical concepts to refer to prior knowledge (Moed et al.). Our long-term goal is to generalize citation-based IR methods and apply the generalized method to both classical references and mathematical concepts. In this paper, we suggest how mathematical formulae could be cited and define a Formula Concept Retrieval challenge with two subtasks: Formula Concept Discovery (FCD) and Formula Concept Recognition (FCR). FCD aims to define and explore a Formula Concept, a named bundle of equivalent representations of a formula, whereas FCR matches a given formula to a previously assigned concept ID. Moreover, we present first machine-learning-based approaches to the FCD and FCR tasks, which we apply to a standardized test collection (the NTCIR arXiv dataset). Our FCD approach yields a recall of 68% for retrieving equivalent representations of frequent formulae and 72% for extracting the formula name from the surrounding text. FCD and FCR will enable citing formulae within mathematical documents and facilitate semantic search as well as similarity computations for plagiarism detection or document recommender systems.
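To make the FCR subtask concrete, the following minimal sketch matches a formula against concepts that each bundle several equivalent LaTeX representations. It is an illustration under stated assumptions, not the approach evaluated in this paper: it swaps the machine-learning models for a plain token-overlap (Jaccard) similarity, and the concept IDs, the example representations, the tokenizer, and the threshold are all hypothetical.

```python
import re

# Hypothetical tokenizer: LaTeX commands, single letters, numbers, other symbols.
# Grouping braces are ignored because they carry no meaning of their own here.
TOKEN_RE = re.compile(r"\\[A-Za-z]+|[A-Za-z]|\d+|[^\sA-Za-z\d{}]")


def tokenize(latex):
    """Return the set of tokens that occur in a LaTeX string."""
    return set(TOKEN_RE.findall(latex))


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical concept store: concept ID -> equivalent representations.
# In the FCD subtask, such bundles would be discovered rather than hand-written.
CONCEPTS = {
    "concept:klein-gordon": [
        r"(\Box + m^2) \psi = 0",
        r"\partial_t^2 \psi - \nabla^2 \psi + m^2 \psi = 0",
    ],
    "concept:mass-energy": [
        r"E = m c^2",
    ],
}


def recognize(formula, concepts=CONCEPTS, threshold=0.5):
    """Return (concept_id, score) for the best match, or (None, score)
    if no stored representation reaches the (assumed) threshold."""
    query = tokenize(formula)
    best_id, best_score = None, 0.0
    for concept_id, representations in concepts.items():
        for representation in representations:
            score = jaccard(query, tokenize(representation))
            if score > best_score:
                best_id, best_score = concept_id, score
    if best_score >= threshold:
        return best_id, best_score
    return None, best_score
```

Calling `recognize(r"(\Box + m^2) \phi = 0")` assigns the query to the hypothetical Klein-Gordon concept even though the field symbol differs, while an unrelated formula such as `a + b` falls below the threshold and receives no concept ID. A set-based similarity ignores token order and structure, so it serves only to illustrate the task interface.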
