Internet services designed for human use are being abused by programs. We present a defense against such attacks in the form of a CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) that exploits the gap in ability between humans and machines in reading images of text. CAPTCHAs are a special case of 'human interactive proofs,' a broad class of security protocols that allow people to identify themselves over networks as members of given groups. We point out vulnerabilities of reading-based CAPTCHAs to dictionary and computer-vision attacks. We also draw on the literature on the psychophysics of human reading, which suggests fresh defenses available to CAPTCHAs. Motivated by these considerations, we propose BaffleText, a CAPTCHA that uses non-English pronounceable words to defend against dictionary attacks, and Gestalt-motivated image-masking degradations to defend against image restoration attacks. Experiments on human subjects confirm the human legibility and user acceptance of BaffleText images. We have found an image-complexity measure that correlates well with user acceptance and helps engineer the generation of challenges to fit the ability gap. Recent computer-vision attacks, run independently by Mori and Malik, suggest that BaffleText is stronger than two existing CAPTCHAs.
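To illustrate the idea of non-English pronounceable words, here is a minimal sketch that assumes a character-trigram Markov model trained on English words and rejects any output found in the dictionary. The abstract does not specify the generator, so the model choice, the toy word list `TRAINING_WORDS`, and the function names `build_trigram_model` and `generate_word` are all hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical toy lexicon; a realistic generator would train on a large English word list.
TRAINING_WORDS = [
    "reading", "machine", "banter", "letter", "signal", "mantle",
    "garden", "render", "listen", "sander", "tinder", "marble",
]


def build_trigram_model(words):
    """Map each two-character context to the characters seen after it."""
    model = defaultdict(list)
    for word in words:
        padded = "^^" + word + "$"  # '^' marks the start, '$' marks the end
        for i in range(len(padded) - 2):
            model[padded[i:i + 2]].append(padded[i + 2])
    return model


def generate_word(model, rng, min_len=5, max_len=8, banned=frozenset(), max_tries=1000):
    """Sample a word that follows English letter statistics but is not in `banned`."""
    for _ in range(max_tries):
        context, letters = "^^", []
        while len(letters) <= max_len:
            nxt = rng.choice(model[context])
            if nxt == "$":
                break
            letters.append(nxt)
            context = context[1] + nxt
        word = "".join(letters)
        # Reject words of the wrong length and anything a dictionary attack could look up.
        if min_len <= len(word) <= max_len and word not in banned:
            return word
    raise RuntimeError("no acceptable word found; enlarge the training list")


if __name__ == "__main__":
    model = build_trigram_model(TRAINING_WORDS)
    rng = random.Random()
    print(generate_word(model, rng, banned=frozenset(TRAINING_WORDS)))
```

Because each letter is drawn from contexts observed in real words, the output tends to be pronounceable, while the rejection step keeps it out of the lexicon an attacker would search.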