Secure Arabic Handwritten CAPTCHA Generation Using OCR Operations

Handwritten CAPTCHAs can be generated from pre-written or synthesized words, with distortions and noise added to resist OCR attacks. This paper takes a different approach: it uses OCR operations themselves to secure the CAPTCHAs. We apply a number of operations found in many handwriting recognition systems (such as segmentation and baseline detection) to distort the pre-written word image itself, so that breaking the resulting CAPTCHA becomes more difficult. These OCR operations are applied in addition to the global image distortions generally used on CAPTCHAs. The method is presented for Arabic handwritten words, because the cursive Arabic script lends itself to a variety of OCR operations. To the best of our knowledge, this work is the first to generate Arabic handwritten CAPTCHAs. We evaluate the method on the KHATT database of offline Arabic handwritten text. In terms of usability, the method achieves 88% to 90% accuracy. Security is evaluated with a holistic word recognition attack, which achieves less than 0.5% accuracy. Lexicon-based attacks are made difficult by working at the Arabic sub-word level and randomly selecting sub-words to build each CAPTCHA.
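As a rough illustration of the idea, the sketch below shows how two of the ingredients mentioned above could be combined: a baseline estimate taken from the horizontal projection profile, a per-column vertical shift that disturbs that baseline, and random selection of sub-words. This is not the paper's implementation. The function names (`detect_baseline`, `shift_columns`, `pick_subwords`), the binary list-of-lists image format, and the choice of a projection-peak baseline are all assumptions made for this example.

```python
import random


def detect_baseline(image):
    """Estimate the baseline as the row with the most ink pixels.

    `image` is a binary image given as a list of rows, where 1 is ink
    and 0 is background (an assumed format for this sketch).
    """
    row_sums = [sum(row) for row in image]
    return max(range(len(row_sums)), key=row_sums.__getitem__)


def shift_columns(image, offsets):
    """Shift each column vertically by offsets[x] (positive moves down).

    Ink pushed outside the image is dropped; vacated cells become background.
    """
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for x in range(width):
        for y in range(height):
            new_y = y + offsets[x]
            if image[y][x] and 0 <= new_y < height:
                out[new_y][x] = 1
    return out


def pick_subwords(pool, count, rng):
    """Draw `count` sub-words at random (with replacement) from `pool`."""
    return [rng.choice(pool) for _ in range(count)]


# A tiny 4x4 "word" whose densest row (row 2) plays the role of the baseline.
word = [
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
]
baseline = detect_baseline(word)
distorted = shift_columns(word, [0, 1, 0, -1])

# Hypothetical sub-word labels; a real pool would hold sub-word images.
rng = random.Random(0)
chosen = pick_subwords(["sw_a", "sw_b", "sw_c"], 2, rng)
```

In this sketch, breaking up the densest row weakens the projection peak that a baseline detector would rely on, and drawing sub-words at random means the resulting string need not be a dictionary word.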
