BaffleText: a Human Interactive Proof

Internet services designed for human use are being abused by programs. We present a defense against such attacks in the form of a CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) that exploits the difference in ability between humans and machines in reading images of text. CAPTCHAs are a special case of 'human interactive proofs,' a broad class of security protocols that allow people to identify themselves over networks as members of given groups. We point out vulnerabilities of reading-based CAPTCHAs to dictionary and computer-vision attacks. We also draw on the literature on the psychophysics of human reading, which suggests fresh defenses available to CAPTCHAs. Motivated by these considerations, we propose BaffleText, a CAPTCHA that uses non-English pronounceable words to defend against dictionary attacks, and Gestalt-motivated image-masking degradations to defend against image-restoration attacks. Experiments on human subjects confirm the human legibility and user acceptance of BaffleText images. We have found an image-complexity measure that correlates well with user acceptance and assists in engineering the generation of challenges to fit the ability gap. Recent computer-vision attacks, run independently by Mori and Malik, suggest that BaffleText is stronger than two existing CAPTCHAs.
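To make the dictionary-attack defense concrete, the following is a minimal sketch of one way to produce pronounceable strings that are not in a given word list. The abstract does not say how BaffleText generates its words, so the character-trigram sampling used here, the tiny `LEXICON`, and the function names `build_trigram_model`, `sample_word`, and `generate_challenge` are all illustrative assumptions rather than the authors' method.

```python
import random
from collections import defaultdict

# Tiny illustrative word list (an assumption). A practical generator would
# train on, and reject against, a large dictionary instead.
LEXICON = ["reading", "machine", "picture", "network", "service", "pattern",
           "mention", "captive", "meaning", "version", "section", "rending"]


def build_trigram_model(words):
    """Map each two-character context to the characters observed after it."""
    model = defaultdict(list)
    for word in words:
        padded = "^^" + word + "$"  # '^' marks the start, '$' marks the end
        for i in range(len(padded) - 2):
            model[padded[i:i + 2]].append(padded[i + 2])
    return model


def sample_word(model, rng, max_len):
    """Sample one string by walking the trigram model from the start context."""
    context, letters = "^^", []
    while len(letters) < max_len:
        nxt = rng.choice(model[context])
        if nxt == "$":
            break
        letters.append(nxt)
        context = context[1] + nxt
    return "".join(letters)


def generate_challenge(words, rng, min_len=5, max_len=8, max_tries=1000):
    """Return a sampled string of acceptable length that is not in `words`."""
    model = build_trigram_model(words)
    lexicon = set(words)
    for _ in range(max_tries):
        candidate = sample_word(model, rng, max_len)
        if min_len <= len(candidate) <= max_len and candidate not in lexicon:
            return candidate
    raise RuntimeError("no suitable candidate found")


if __name__ == "__main__":
    print(generate_challenge(LEXICON, random.Random()))
```

Because each letter is drawn from letter sequences seen in real words, the output tends to look pronounceable, while the rejection step keeps it out of the attacker's presumed dictionary. With a word list this small, a sampled string may still be a real English word that simply is not listed, which is why a large lexicon matters in practice.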
