Paper information - On Devanagari document processing

On Devanagari document processing

The Devanagari document processing system discussed here makes use of various knowledge sources at all levels. Extraction of the text zone from a document is a preprocessing stage that uses document layout knowledge represented syntactically. The text zone is then segmented into lines, lines into words, and words into characters. Since a Devanagari character is a complex composition of symbols, various algorithms are used to further segment each character into its constituent symbols instead of treating the character as a unit. The symbols are then recognized using various features that are extracted and saved during the training phase. The recognized symbols are composed back into characters and sent for validation through a partitioned dictionary.
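The abstract does not say which segmentation algorithms the system uses. As a rough illustration of the line and word stages only, the sketch below assumes a binarized text zone (1 = ink, 0 = background) and uses projection profiles, a common technique swapped in here for illustration. The names `runs_of_ink`, `segment_lines`, and `segment_columns` are hypothetical and do not come from the paper.

```python
from typing import List, Tuple

Image = List[List[int]]  # assumed representation: 1 = ink, 0 = background


def runs_of_ink(profile: List[int]) -> List[Tuple[int, int]]:
    """Return half-open (start, end) index ranges where the profile is non-zero."""
    runs = []
    start = None
    for i, count in enumerate(profile):
        if count > 0 and start is None:
            start = i                      # a run of ink begins
        elif count == 0 and start is not None:
            runs.append((start, i))        # the run ended at the previous index
            start = None
    if start is not None:                  # a run that reaches the edge
        runs.append((start, len(profile)))
    return runs


def segment_lines(image: Image) -> List[Tuple[int, int]]:
    """Split a text zone into lines using the horizontal projection profile."""
    row_profile = [sum(row) for row in image]
    return runs_of_ink(row_profile)


def segment_columns(image: Image, top: int, bottom: int) -> List[Tuple[int, int]]:
    """Split one line (rows top..bottom-1) at empty columns."""
    width = len(image[0]) if image else 0
    col_profile = [
        sum(image[r][c] for r in range(top, bottom)) for c in range(width)
    ]
    return runs_of_ink(col_profile)


if __name__ == "__main__":
    # A tiny made-up text zone: two lines, the first holding two blobs.
    page = [
        [0, 0, 0, 0, 0, 0, 0],
        [1, 1, 1, 0, 1, 1, 0],
        [0, 1, 0, 0, 1, 0, 0],
        [0, 0, 0, 0, 0, 0, 0],
        [1, 1, 0, 0, 0, 0, 0],
    ]
    for top, bottom in segment_lines(page):
        print((top, bottom), segment_columns(page, top, bottom))
```

In Devanagari the header line (shirorekha) usually joins the characters of a word, so empty-column splitting of this kind tends to separate words rather than characters. Splitting a word into characters, and characters into symbols, would need further steps that this sketch does not attempt.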

R.M.K. Sinha | Veena Bansal
