Extracting Definitions of Mathematical Expressions in Scientific Papers (The 26th Annual Conference of the Japanese Society for Artificial Intelligence: Culture, Science and Technology, and the Future) -- (International Organized Session "Alan Turing Year Special Session on AI Research That Can Change The World")

Natural language definitions of mathematical expressions are essential for understanding the mathematical content of scientific papers. The textual description corresponding to a mathematical expression identifies what kind of symbol or function it is and gives a specific name by which it can be referred to. Our objective is to extract definitions of mathematical expressions automatically. Because no annotated data on the relations between mathematical expressions and their definitions was available, we created an annotated corpus; such data also makes it possible to compare different approaches to this relation extraction task. This paper introduces guidelines for annotating definitions of mathematical expressions. Using 14 manually annotated papers from Springer, we compared pattern-matching and machine-learning-based methods against a naive baseline that takes the nearest noun in the text preceding the expression. The results show the potential of our approach for detecting definitions and the usefulness of our annotated data.
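
The nearest-noun comparison method mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the sentence has already been tokenized and part-of-speech tagged with Penn Treebank-style tags (nouns start with "NN"), and that mathematical expressions carry a hypothetical "MATH" tag. The function names are also illustrative.

```python
from typing import Dict, List, Optional, Tuple

# (surface form, POS tag). Penn Treebank-style tags are an assumption.
Token = Tuple[str, str]

# Hypothetical tag marking a token that is a mathematical expression.
MATH_TAG = "MATH"


def nearest_preceding_noun(tokens: List[Token], math_index: int) -> Optional[str]:
    """Return the closest noun to the left of tokens[math_index], or None."""
    # Walk leftwards from the expression and stop at the first noun.
    for word, tag in reversed(tokens[:math_index]):
        if tag.startswith("NN"):
            return word
    return None


def extract_baseline(tokens: List[Token]) -> Dict[str, Optional[str]]:
    """Map each mathematical expression in a sentence to its nearest preceding noun."""
    return {
        word: nearest_preceding_noun(tokens, i)
        for i, (word, tag) in enumerate(tokens)
        if tag == MATH_TAG
    }


# "The learning rate \eta controls the step size s ."
sentence = [
    ("The", "DT"), ("learning", "NN"), ("rate", "NN"), ("\\eta", MATH_TAG),
    ("controls", "VBZ"), ("the", "DT"), ("step", "NN"), ("size", "NN"),
    ("s", MATH_TAG), (".", "."),
]

# "where x denotes the input": the definition follows the expression.
trailing = [
    ("where", "WRB"), ("x", MATH_TAG), ("denotes", "VBZ"),
    ("the", "DT"), ("input", "NN"),
]

print(extract_baseline(sentence))   # {'\\eta': 'rate', 's': 'size'}
print(extract_baseline(trailing))   # {'x': None}
```

The sketch also shows why such a baseline is naive: it returns only a single noun ("rate" rather than "learning rate"), and it finds nothing when the definition follows the expression, as in the second sentence.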