Feature Extraction Methods for Character Recognition

Feature extraction is an important stage in character recognition, particularly for handprinted characters. Based on a study of the extensive literature on handwritten character recognition, this review paper describes the commonly used feature extraction techniques: global features, distribution of points, geometrical and topological features, linguistic descriptions, use of context, and fuzzy sets. Preliminary considerations, namely the pre-processing operations of isolation, smoothing, thinning, normalization and centering, are discussed first. Features of different character sets, including numerals, alphanumerics, FORTRAN characters, Katakana, Kanji and the characters of some Indian languages (Devanagari, Bengali, Telugu and Tamil), are given. Finally, the scope for future research in the area is indicated.
