Foreground-background segmentation of optical character recognition labels by a single-layer recurrent neural network

Abstract. Optical character recognition (OCR) algorithms typically start from a binary label image. The need for a binary image is complicated by the fact that most imaging devices produce multi-valued data: a grey scale image. The problem then becomes how to extract the meaningful character data from the grey scale image. Image artifacts such as dirt, variations in background intensity, and imaging noise complicate the character extraction. When inspecting packages moving on a conveyor belt, we have control over the optical parameters of the system. Via autofocus and controlled lighting, parameters such as the optical path length, field of view, and illumination intensity may be adjusted. However, no control can be placed on the labels; the label reading system is entirely subject to the package sender's whims. We describe the development of a recurrent neural network that segments grey scale label images into binary label images. To determine a pixel label, the neural network takes into account three sources of information: pixel intensities, correlations between neighboring labels, and edge gradients. These three sources of information are succinctly combined via the network's energy function. By changing its label state to minimize the energy function, the network satisfies constraints imposed by the input image and the current label values. The network has no knowledge of shape; information about what constitutes a desirable shape is probably unwarranted at this earliest stage of image processing. Although significant image filtering could be performed by a network that knows what characters should look like, such knowledge is unavoidably font specific. Further, there is the problem of teaching the network about shapes. The neural network described here does not need to be taught, and learning is typically extremely time consuming. To be mappable to analog hardware, it is desirable that the neural equations be deterministic. Two deterministic networks are developed and compared. The first operates at the zero temperature limit: the original Hopfield network. The second employs the mean field annealing algorithm. It is shown that, with only a moderate increase in computational requirements, the mean field approach produces far superior results.
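The mean-field approach the abstract describes can be illustrated with a minimal sketch. This is not the paper's implementation: the energy here combines only two of the three information sources (pixel intensities as an external field, neighbor-label correlations as a ferromagnetic coupling), omits the edge-gradient term, and uses wrap-around image boundaries for brevity. All parameter names (`J`, `h_scale`, `thresh`, the annealing schedule in `betas`) are illustrative assumptions, not values from the paper.

```python
import numpy as np

def mean_field_segment(img, J=1.0, h_scale=4.0, thresh=0.5,
                       betas=(0.2, 0.5, 1.0, 2.0, 4.0), sweeps=10):
    """Mean-field annealing for binary label images (simplified sketch).

    img   : 2-D array of grey levels in [0, 1] (0 = dark ink).
    Labels s_i take values in {-1, +1}; the mean-field variables m_i
    in [-1, 1] are updated as m_i = tanh(beta * (J * sum_nb(m) + h_i)),
    with beta (inverse temperature) raised over the annealing schedule.
    """
    # External field: dark (ink) pixels are pushed toward label +1.
    h = h_scale * (thresh - img)
    m = np.zeros_like(img, dtype=float)   # start at the high-temperature fixed point
    for beta in betas:                    # anneal: raise inverse temperature
        for _ in range(sweeps):
            # Sum of the four nearest-neighbour means (wrap-around boundary).
            nb = (np.roll(m, 1, axis=0) + np.roll(m, -1, axis=0)
                  + np.roll(m, 1, axis=1) + np.roll(m, -1, axis=1))
            m = np.tanh(beta * (J * nb + h))
    return m > 0                          # binary foreground mask
```

Taking the zero-temperature limit instead (replacing `tanh(beta * x)` with `sign(x)` and updating one pixel at a time) yields the original Hopfield-style descent the paper uses as its baseline; the annealing schedule is what lets the mean-field version escape the poor local minima that descent tends to find.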
