Novel script line identification method for script normalization and feature extraction in on-line handwritten whiteboard note recognition

When writing on a whiteboard, the writer stands rather than sits, and the writing arm is not supported. Under these adverse conditions, the script lines of the handwritten text vary strongly and cannot be approximated by low-order polynomials. In this paper, we propose a novel method for identifying script lines in handwritten whiteboard notes: each sample point of the script trajectory is assigned to a script line, and the optimal assignment is found with the Viterbi algorithm. We present two ways to use the resulting script line characterization. In the first, the script lines are used to normalize the skew and size of the text lines. In the second, the feature vector of a standard recognition system is augmented with the explicit script line membership of each sample point, with the aim of reducing confusions between characters that differ in size rather than in shape (such as "s" and "S", or "e" and "l"). Experiments show that, compared to a baseline system, a relative improvement of 3.3% in character-level accuracy and 3.4% in word-level accuracy can be achieved with the proposed script line identification method. In addition, the character confusions described above can be reduced. Finally, both uses of the script lines are examined and discussed in further detail.
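The idea of assigning sample points to script lines and decoding the best assignment with the Viterbi algorithm can be illustrated with a small sketch. The abstract does not specify the cost model, so everything below is an assumption made for illustration only: the function name `viterbi_assign`, the fixed line heights, the squared-distance emission cost, and the constant penalty `switch_cost` for changing lines are all hypothetical and are not the method of the paper.

```python
import numpy as np

def viterbi_assign(y, line_heights, switch_cost=1.0):
    """Assign each sample point to one script line with minimal total cost.

    Assumed toy model (not from the paper): the cost of assigning a point
    to a line is the squared vertical distance to that line, and changing
    the line between consecutive points costs `switch_cost`.
    """
    y = np.asarray(y, dtype=float)
    h = np.asarray(line_heights, dtype=float)
    n, k = len(y), len(h)

    # emit[t, j]: cost of assigning point t to line j.
    emit = (y[:, None] - h[None, :]) ** 2
    # trans[i, j]: cost of moving from line i to line j.
    trans = switch_cost * (1.0 - np.eye(k))

    cost = np.empty((n, k))             # best cost of a path ending in (t, j)
    back = np.zeros((n, k), dtype=int)  # best predecessor line for (t, j)
    cost[0] = emit[0]
    for t in range(1, n):
        total = cost[t - 1][:, None] + trans  # total[i, j]: from i to j
        back[t] = np.argmin(total, axis=0)
        cost[t] = total[back[t], np.arange(k)] + emit[t]

    # Backtrack from the cheapest final state.
    path = np.empty(n, dtype=int)
    path[-1] = int(np.argmin(cost[-1]))
    for t in range(n - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path

# Hypothetical heights of four script lines, bottom to top.
lines = [0.0, 1.0, 2.0, 3.0]
# Hypothetical y-coordinates of six sample points.
points = [1.0, 1.1, 2.0, 1.9, 3.0, 1.0]
print(viterbi_assign(points, lines, switch_cost=0.1).tolist())
```

In this toy model a small `switch_cost` lets every point snap to its nearest line, while a larger value keeps isolated outliers on the line of their neighbours. The resulting line index per sample point is the kind of explicit membership that could be appended to a feature vector.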
