Stroke order normalization for improving recognition of online handwritten mathematical expressions

Abstract

We present a technique based on stroke order normalization for improving recognition of online handwritten mathematical expressions (MEs). A stroke order dependent system has lower time complexity than a stroke order free system, but it must incorporate special grammar rules to cope with stroke order variations. Stroke order normalization solves this problem, and also handles unexpected stroke order variations, without increasing the time complexity of ME recognition. To normalize stroke order, we modify the X–Y cut method, since its original form causes problems when structural components in an ME overlap. First, vertically ordered strokes are located by detecting vertical symbols and their upper/lower components, which are treated as MEs and reordered recursively. Second, unordered strokes on the left side of the vertical symbols are reordered as horizontally ordered strokes. Third, the remaining strokes are reordered recursively. Horizontally ordered strokes are reordered from left to right, and vertically ordered strokes are reordered from top to bottom. Finally, the proposed stroke order normalization is combined with a stroke order dependent ME recognition system. Evaluations on the CROHME 2014 database show that the ME recognition system incorporating stroke order normalization outperforms all other systems that use only CROHME 2014 for training, while keeping the processing time low.
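To make the reordering idea concrete, the following is a minimal sketch of the plain recursive X–Y cut applied to stroke bounding boxes. It is not the modified method proposed here: it has no vertical-symbol detection and no handling of overlapping components. The box representation, the function names `split_by_gaps` and `xy_cut_order`, and the fallback to sorting by left edge are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple

# Hypothetical stroke representation: an axis-aligned bounding box
# (x_min, y_min, x_max, y_max), with y growing downward as on a tablet.
Box = Tuple[float, float, float, float]


def split_by_gaps(ids: Sequence[int], boxes: Sequence[Box],
                  lo: int, hi: int) -> List[List[int]]:
    """Group strokes into runs separated by empty gaps in the projection
    onto one axis (lo/hi index the min/max coordinate of that axis)."""
    ordered = sorted(ids, key=lambda i: boxes[i][lo])
    groups = [[ordered[0]]]
    reach = boxes[ordered[0]][hi]  # farthest extent covered so far
    for i in ordered[1:]:
        if boxes[i][lo] > reach:   # projection gap: start a new group
            groups.append([i])
        else:                      # overlaps the current run
            groups[-1].append(i)
        reach = max(reach, boxes[i][hi])
    return groups


def xy_cut_order(boxes: Sequence[Box],
                 ids: Optional[Sequence[int]] = None) -> List[int]:
    """Return stroke indices reordered left to right, then top to bottom,
    by recursively cutting at gaps in the x and y projections."""
    if ids is None:
        ids = list(range(len(boxes)))
    if len(ids) <= 1:
        return list(ids)
    # Try a cut along x first (left to right), then along y (top to bottom).
    for lo, hi in ((0, 2), (1, 3)):
        groups = split_by_gaps(ids, boxes, lo, hi)
        if len(groups) > 1:
            return [i for g in groups for i in xy_cut_order(boxes, g)]
    # No cut separates the strokes (they overlap on both axes):
    # fall back to ordering by left edge. This is the case the plain
    # method handles poorly and the modified method targets.
    return sorted(ids, key=lambda i: boxes[i][0])


# Strokes of "a/b + c" written out of order: c, fraction bar, b, a, plus.
strokes = [
    (30, 8, 36, 14),   # 0: c
    (0, 10, 10, 11),   # 1: fraction bar
    (3, 13, 7, 19),    # 2: b (denominator)
    (3, 1, 7, 7),      # 3: a (numerator)
    (16, 7, 24, 15),   # 4: plus sign
]
normalized = xy_cut_order(strokes)
```

In this toy input the fraction is separated from the plus sign and from `c` by cuts along x, and is then split top to bottom into numerator, bar, and denominator. A symbol such as a square root, whose box encloses its operand, would defeat both cuts and end up in the fallback branch.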
