Paper information - Breaking text-based CAPTCHAs with variable word and character orientation

Breaking text-based CAPTCHAs with variable word and character orientation

This paper presents a novel approach for the automatic segmentation and recognition of CAPTCHAs with variable orientation and random collapse of overlapped characters. It also discusses an extension of the approach for breaking the 2012 version of reCAPTCHA. The proposal consists of straightening the characters and the word in a CAPTCHA and then using a three-color bar code to segment them. The straightened characters and the whole word are recognized by an original SVM-based learning classifier. The main goal of this research is to reduce the vulnerability of CAPTCHAs to spam and fraud, and to provide an approach by which OCR systems can recognize handwritten text or degraded and damaged text in ancient manuscripts. The framework designed for breaking CAPTCHAs with the proposed approach was tested and achieved an average segmentation success rate of up to 82% for the 2011 version of reCAPTCHA, and 95.5% with the extended approach for the 2012 version of reCAPTCHA, with a response time of less than 0.5 s per two-word reCAPTCHA. The implemented SVM classifier shows a competitive precision of about 94%. These results confirm that the proposed approach may be used to develop new security mechanisms that protect users against cyber-criminal activities and Internet threats.

Automatic segmentation and recognition of CAPTCHAs in Web sites is proposed.
Anti-recognition techniques use collapsed characters with variable orientation.
Aligned word and straightened characters are segmented by three-color bar code.
Original SVM-based learning classifier provides real-time CAPTCHA recognition.
Extended approach for beating reCAPTCHA of version 2012 shows better performance.

Oleg Starostenko | Vicente Alarcón Aquino | Claudia Cruz-Perez | Fernando Uceda-Ponga