A Statistical Correction-Rejection Strategy for OCR Outputs in Persian Personal Information Forms

In this paper, a maximum a posteriori (MAP) statistical modeling approach is used to correct and verify OCR outputs for Persian names and surnames. In addition, an efficient neural-network-based rejection method is presented and tested. Because of the large variety of Persian surnames, a statistical grammar is added to the MAP strategy to generate new surnames that are not included in the dictionary. The model has been analytically formulated and practically implemented. The results show a large reduction in character and word errors, while the added computation is negligible compared with the complexity of character recognition.
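As a rough illustration of dictionary-based MAP correction with rejection, the following minimal sketch picks the lexicon entry w that maximizes P(w) multiplied by the product of per-character likelihoods P(o_i | w_i). It is not the model formulated in the paper: the lexicon, the confusion probabilities, the transliterated names, and the function name `map_correct` are all hypothetical, the statistical grammar for unseen surnames is omitted, and a plain posterior threshold stands in for the neural-network rejector.

```python
# Toy lexicon with prior probabilities P(w). All values are invented.
LEXICON = {"reza": 0.6, "sara": 0.3, "rana": 0.1}

# Hypothetical OCR confusion probabilities P(observed char | true char).
CONFUSION = {("o", "a"): 0.05, ("n", "r"): 0.03}


def map_correct(observed, lexicon, confusion,
                match_p=0.9, floor=1e-6, reject_below=0.9):
    """Return (best_word, posterior, accepted) for an OCR output string.

    Assumptions of this sketch: characters are independent given the word,
    OCR never inserts or deletes characters, and rejection is a simple
    threshold on the normalized posterior (not a neural network).
    """
    scores = {}
    for word, prior in lexicon.items():
        if len(word) != len(observed):
            continue  # only same-length candidates are considered
        p = prior
        for o, c in zip(observed, word):
            # Likelihood of seeing o when the true character is c.
            p *= match_p if o == c else confusion.get((o, c), floor)
        scores[word] = p

    total = sum(scores.values())
    if total == 0.0:
        return None, 0.0, False  # no candidate at all: reject

    best = max(scores, key=scores.get)
    posterior = scores[best] / total
    return best, posterior, posterior >= reject_below
```

Under these toy numbers, calling `map_correct("rezo", LEXICON, CONFUSION)` selects `"reza"` and accepts it, whereas an input that matches no entry well, such as `"xxxx"`, falls back to the prior and is rejected because no candidate dominates the posterior.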