Comparison of Bernoulli and Gaussian HMMs Using a Vertical Repositioning Technique for Off-Line Handwriting Recognition

In this paper, a vertical repositioning method based on the center of gravity is investigated for handwriting recognition systems and evaluated on databases containing Arabic and French handwriting. Experiments show that vertical distortion in images has a large impact on the performance of HMM-based handwriting recognition systems. Recently, good results were obtained with Bernoulli HMMs (BHMMs) using a preprocessing step that vertically repositions binarized images. To isolate the effect of this preprocessing from that of the BHMM model, experiments were conducted with Gaussian HMMs and with the LSTM-RNN tandem HMM approach, yielding relative word error rate (WER) improvements of 33% on the Arabic database and up to 62% on the French database.
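As a rough illustration of the idea of center-of-gravity repositioning, the sketch below shifts a binarized window vertically so that the center of gravity of its ink pixels lands on the middle row. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names `reposition_vertically` and `repositioned_windows`, the convention that ink pixels are 1, the zero padding at the image borders, and the per-column sliding window are all hypothetical choices made for this example.

```python
import numpy as np


def reposition_vertically(window: np.ndarray) -> np.ndarray:
    """Shift a binary window (assumed: 1 = ink) so that the vertical
    center of gravity of the ink lies on the middle row."""
    height = window.shape[0]
    mass = window.sum()
    if mass == 0:
        # Empty window: there is no center of gravity, leave it unchanged.
        return window.copy()

    rows = np.arange(height)
    # Ink-weighted mean row index = vertical center of gravity.
    cog = (window.sum(axis=1) * rows).sum() / mass
    shift = int(round((height - 1) / 2 - cog))

    out = np.zeros_like(window)
    if shift > 0:
        # Move content down; rows pushed past the bottom are clipped.
        out[shift:] = window[: height - shift]
    elif shift < 0:
        # Move content up; rows pushed past the top are clipped.
        out[: height + shift] = window[-shift:]
    else:
        out[:] = window
    return out


def repositioned_windows(image: np.ndarray, width: int) -> list:
    """Extract one window per image column (an assumption of this sketch)
    and reposition each window independently."""
    pad = width // 2
    # Zero-pad left and right so every column has a full-width window.
    padded = np.pad(image, ((0, 0), (pad, pad)))
    return [
        reposition_vertically(padded[:, c : c + width])
        for c in range(image.shape[1])
    ]
```

Because each window is shifted independently, ink that drifts up or down along a text line is brought to a common vertical position before feature extraction, which is the kind of vertical distortion the abstract refers to.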
