Recognition of cursive Roman handwriting: past, present and future

This paper reviews the state of the art in off-line Roman cursive handwriting recognition. The input to an off-line handwriting recognition system is an image of a digit, a word, or, more generally, some text, and the system produces an ASCII transcription of that input as output. This task involves a number of processing steps, some of which are quite difficult. Typically, preprocessing, normalization, feature extraction, classification, and postprocessing operations are required. We survey the state of the art, analyze recent trends, and try to identify challenges for future research in this field.
