Effects of context, complexity, and clustering on evaluation for math formula retrieval

Several test collections now exist for the formula retrieval task, in which a system’s goal is to identify useful mathematical formulas to show in response to a query posed as a formula. These test collections differ in query format, query complexity, number of queries, content source, and relevance definition. Comparisons among six formula retrieval test collections illustrate that defining relevance based on query and/or document context can be consequential, that system results vary markedly with formula complexity, and that judging relevance after clustering formulas with identical symbol layouts (i.e., identical Symbol Layout Trees) can affect system preference ordering.
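
The clustering step can be illustrated with a minimal sketch, assuming each retrieved formula instance comes with a string serialization of its Symbol Layout Tree. The `slt` strings, the sample `ranked` list, and the helper name `cluster_ranking` are hypothetical and do not reflect the data format or tooling of any of the six collections. Instances whose SLT strings match exactly are collapsed into one cluster at the rank of the first occurrence, so relevance would be judged once per distinct layout rather than once per instance.

```python
def cluster_ranking(ranked):
    """Group a ranked list of (formula_id, slt) pairs by identical SLT string.

    Returns a list of (slt, [formula_ids]) ordered by each cluster's
    first appearance in the original ranking.
    """
    clusters = {}  # dicts preserve insertion order in Python 3.7+
    for formula_id, slt in ranked:
        clusters.setdefault(slt, []).append(formula_id)
    return list(clusters.items())


# Hypothetical ranked output of a system: (formula instance id, SLT string).
ranked = [
    ("f1", "x^2+y"),
    ("f2", "a+b"),
    ("f3", "x^2+y"),  # same layout as f1, so it joins f1's cluster
    ("f4", "a+b"),    # same layout as f2
    ("f5", "z"),
]

clustered = cluster_ranking(ranked)
for rank, (slt, ids) in enumerate(clustered, start=1):
    print(rank, slt, ids)
```

Because duplicates are folded into the first occurrence, five ranked instances become three ranked clusters here. A system that returns many instances of one layout is scored differently after clustering than before it, which is one way the preference ordering between systems could shift.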
