Security and Usability Challenges of Moving-Object CAPTCHAs: Decoding Codewords in Motion

We explore the robustness and usability of moving-image object recognition (video) captchas, designing and implementing automated attacks based on computer vision techniques. Our approach is suitable for broad classes of moving-image captchas involving rigid objects. We first present an attack that defeats instances of a state-of-the-art captcha of this kind (NuCaptcha), which uses dynamic text strings called codewords. We then consider design modifications to mitigate the attacks (e.g., overlapping characters more closely). We implement the modified captchas and test whether designs modified for greater robustness maintain usability. Our lab-based studies show that the modified captchas fail to offer acceptable usability, even when the captcha strength is reduced below acceptable targets, signaling that the modified designs are not viable. We also implement and test another variant of moving text strings using the known emerging images idea. This variant is resilient to our attacks and also offers usability similar to that of commercially available approaches. We explain why fundamental elements of the emerging images concept resist our current attack where others fail.
