Text Captcha Is Dead? A Large Scale Deployment and Empirical Study

The development of deep learning techniques has significantly increased the ability of computers to recognize CAPTCHAs (Completely Automated Public Turing tests to tell Computers and Humans Apart), breaking or weakening the security of existing captcha schemes. To defend against these attacks, recent work has proposed leveraging adversarial machine learning to perturb captcha images. However, these approaches either require prior knowledge of the captcha-solving models or lack adaptivity to the evolving behaviors of attackers. Most importantly, none of them has been deployed in a practical application, so their real-world applicability and effectiveness remain unknown. In this work, we introduce advCAPTCHA, a practical adversarial captcha generation system that defends against deep learning based captcha solvers, and deploy it on a large-scale online platform with nearly a billion users. To the best of our knowledge, this is the first such work deployed on an international large-scale online platform. By applying adversarial learning techniques in a novel manner, advCAPTCHA generates adversarial captchas that significantly reduce the success rate of attackers, as demonstrated by a large-scale online study. We further validate the feasibility of advCAPTCHA in practical applications, as well as its robustness against various attacks. We leverage the platform's existing user risk analysis system to identify potential attackers and serve advCAPTCHA to them; their answers are then used as queries to the attack model, allowing advCAPTCHA to be adapted and fine-tuned as the attack model evolves. Overall, advCAPTCHA can serve as a key enabler for generating robust captchas in practice and provides useful guidelines for captcha developers and practitioners.

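The abstract does not detail how the adversarial perturbations are computed, so as a rough illustration of the general idea, the sketch below applies the classic fast gradient sign method (FGSM) to a toy linear stand-in for a captcha solver. The model, image size, and class count here are all invented for the example; the real system targets deep text-recognition models and uses its own (unpublished) perturbation strategy.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a captcha-solving model: a linear classifier over
# flattened pixel values. This is NOT advCAPTCHA's actual solver model,
# only a minimal illustration of gradient-based perturbation.
W = rng.normal(size=(10, 64))     # 10 hypothetical character classes, 8x8 image
x = rng.uniform(size=64)          # clean captcha image, pixels in [0, 1]
y = 3                             # true character class (arbitrary)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def loss_and_grad(x, y):
    """Cross-entropy loss of the toy solver and its gradient w.r.t. pixels."""
    p = softmax(W @ x)
    loss = -np.log(p[y])
    grad = W.T @ (p - np.eye(10)[y])   # d(loss)/d(x) for softmax + cross-entropy
    return loss, grad

loss_clean, grad = loss_and_grad(x, y)

# FGSM: nudge each pixel by epsilon in the direction that increases the
# solver's loss, keeping the image in the valid pixel range.
epsilon = 0.1
x_adv = np.clip(x + epsilon * np.sign(grad), 0.0, 1.0)

loss_adv, _ = loss_and_grad(x_adv, y)
print(loss_clean, loss_adv)   # the perturbed captcha is harder for the solver
```

The same small, human-imperceptible perturbation principle underlies adversarial captcha generation: humans still read the characters, while the automated solver's error rate rises.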