Advancing Math-Aware Search: The ARQMath-2 Lab at CLEF 2021

ARQMath-2 continues the ARQMath Lab held at CLEF 2020, with two main tasks: (1) answer retrieval, finding answers to mathematical questions among the answers posted on a community question answering site (Math Stack Exchange), and (2) formula retrieval, where formulae in question posts serve as queries for formulae in earlier question and answer posts. In the formula retrieval task, relevance is judged in light of the posts in which the query and retrieved formulae appear. The 2020 Lab created a large new test collection and established strong baselines for both tasks. Plans for ARQMath-2 include extending the same test collection with additional topics, providing standard components for optional use by teams new to the task, and supplying post-hoc evaluation scripts to support tuning of new systems that did not contribute to the 2020 judgment pools.
