Towards a distributed Arabic OCR based on the DTW algorithm: performance analysis

Despite the diversity of products and proposals for printed Arabic optical character recognition (OCR), the problem is not yet well solved. The weakness of existing products stems from the complex morphology and calligraphy of Arabic writing on the one hand, and from the use of lightweight approaches on the other. Some stronger approaches have been proposed but were never commercialised, generally because of their computational complexity. The dynamic time warping (DTW) algorithm is one of these strong approaches. Several studies and experiments have confirmed that printed Arabic OCR based on the DTW algorithm achieves a very good recognition rate, especially for large and very large vocabularies. One attractive feature of the DTW algorithm is its ability to correctly recognize connected or cursive characters (words or sub-words) without prior segmentation. Furthermore, the algorithm performs recognition against a reference library of isolated characters and is highly robust to noise. Unfortunately, the large amount of computation required during recognition makes it very slow and hence restricts its use. Many researchers have attempted to speed up the algorithm, but the proposed solutions generally require specific, high-cost architectures. Loosely coupled architectures such as clusters or computing grids can provide enough power, without additional cost, to distribute the load of such compute-intensive applications. Consequently, we report in this paper the performance analysis of an analytical and an experimental study of a distributed Arabic OCR based on the DTW algorithm running on loosely coupled architectures. The results obtained confirm that loosely coupled architectures, and grid computing in particular, provide a very promising framework for speeding up Arabic OCR based on the DTW algorithm.
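
To make the computational cost mentioned above concrete, the following is a minimal sketch of the textbook DTW recurrence, not the authors' implementation. It assumes each character image has already been reduced to a one-dimensional sequence of numeric features (for example, one value per pixel column); that feature choice, the absolute-difference local cost, and the names `dtw_distance` and `closest_reference` are illustrative assumptions.

```python
from math import inf


def dtw_distance(a, b):
    """Textbook DTW between two numeric sequences, O(len(a) * len(b)) time."""
    n, m = len(a), len(b)
    # cost[i][j] holds the best cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local distance
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] also matched an earlier b element
                cost[i][j - 1],      # b[j-1] also matched an earlier a element
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


def closest_reference(sample, library):
    """Return the label of the reference with the smallest DTW distance.

    `library` is a hypothetical dict mapping a label to its feature sequence.
    """
    return min(library, key=lambda label: dtw_distance(sample, library[label]))
```

In this sketch every comparison against a reference entry is independent of the others, so the library could in principle be partitioned across the nodes of a loosely coupled architecture, with each node returning its best local match.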
