Generation and use of handwritten CAPTCHAs

Automated recognition of unconstrained handwriting remains a challenging research task. Handwriting recognition has traditionally served applications such as postal automation and bank check reading; in this paper, we instead explore its use in designing CAPTCHAs for cyber security. CAPTCHAs (Completely Automated Public Turing tests to tell Computers and Humans Apart) are automatic reverse Turing tests designed so that virtually all humans can pass, but state-of-the-art computer programs fail. Machine-printed, text-based CAPTCHAs are now commonly used to defend against bot attacks. Our focus is on the generation and use of handwritten CAPTCHAs. For this purpose, we use a large repository of handwritten word images that current handwriting recognizers cannot read (even when provided with a lexicon), together with synthetic handwritten samples. We take advantage of both our knowledge of the common sources of error in automated handwriting recognition systems and the salient aspects of human reading. The simultaneous interplay of several Gestalt laws of perception and the geon theory of pattern recognition (which holds that objects are recognized by their components) allows us to explore the parameters that truly separate human and machine abilities.
