VMEXT: A Visualization Tool for Mathematical Expression Trees

Mathematical expressions can be represented as trees consisting of terminal symbols, such as identifiers or numbers (leaf nodes), and functions or operators (non-leaf nodes). Expression trees are an important mechanism for storing and processing mathematical expressions, as well as the most frequently used visualization of their structure. Typically, researchers and practitioners visualize expression trees manually using general-purpose tools. This approach is laborious, redundant, and error-prone. Manual visualizations represent a user's notion of what the markup of an expression should be, but not necessarily what the actual markup is. This paper presents VMEXT, a free and open-source tool that visualizes expression trees directly from parallel MathML. VMEXT simultaneously visualizes the presentation elements and the semantic structure of mathematical expressions, enabling users to quickly spot deficiencies in the Content MathML markup that do not affect the presentation of the expression. Identifying such discrepancies previously required reading the verbose and complex MathML markup. VMEXT also visualizes similar and identical elements of two expressions. Visualizing expression similarity can support developers in designing retrieval approaches and enable improved interaction concepts for users of mathematical information retrieval systems. We demonstrate VMEXT's visualizations in two web-based applications. The first application presents the visualizations alone. The second application shows a possible integration of the visualizations into systems for mathematical knowledge management and mathematical information retrieval. This second application converts LaTeX input to parallel MathML, computes basic similarity measures for mathematical expressions, and visualizes the results using VMEXT.
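To make the tree representation concrete, the following is a minimal sketch of how a small Content MathML fragment could be read into an expression tree with operators as non-leaf nodes and identifiers or numbers as leaf nodes. It is not VMEXT's implementation: the `Node` class, the helper functions, and the sample markup for a + b · 2 are assumptions introduced for illustration, and the sketch only handles `apply` elements whose first child is the operator.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical expression-tree node: a label plus ordered children."""
    label: str
    children: List["Node"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        return not self.children


def local_name(tag: str) -> str:
    """Strip an XML namespace prefix such as '{http://...}apply'."""
    return tag.rsplit("}", 1)[-1]


def from_content_mathml(elem: ET.Element) -> Node:
    """Convert a Content MathML element into a Node (simplified)."""
    if local_name(elem.tag) == "apply":
        # The first child of <apply> is the operator, the rest are operands.
        operator, *operands = list(elem)
        label = (operator.text or "").strip() or local_name(operator.tag)
        return Node(label, [from_content_mathml(o) for o in operands])
    # Identifiers (<ci>) and numbers (<cn>) become leaf nodes.
    return Node((elem.text or "").strip() or local_name(elem.tag))


def leaves(node: Node) -> List[str]:
    """Collect the terminal symbols in left-to-right order."""
    if node.is_leaf():
        return [node.label]
    return [label for child in node.children for label in leaves(child)]


def render(node: Node, depth: int = 0) -> str:
    """Render the tree as indented text, one node per line."""
    lines = ["  " * depth + node.label]
    lines.extend(render(child, depth + 1) for child in node.children)
    return "\n".join(lines)


# Sample markup (an assumption for this sketch) encoding a + b * 2.
CMML = (
    "<apply><plus/><ci>a</ci>"
    "<apply><times/><ci>b</ci><cn>2</cn></apply></apply>"
)

tree = from_content_mathml(ET.fromstring(CMML))
print(render(tree))
```

Printing the indented form shows the operator `plus` at the root, with the identifier `a` and the nested `times` subtree as its children. A text rendering like this is only a stand-in for the graphical visualization the paper describes.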
