An analytic approach for generation of artificial hand-printed character database from given generative models

A large database that incorporates every possible variation is required for training and testing any handwritten character recognition system. Collecting natural samples is inconvenient in many cases. This necessitates generating characters artificially in such a way that the generated instances resemble natural samples and the distribution of their variations conforms to that of human writing. In this paper, we describe an analytic approach for character generation from given generative models. A character is split into several lines and arcs. A set of equations and parameters describing these lines and arcs completely defines the character. Each parameter is set to its average value over a collection of natural samples. The parameter values are then randomly modulated to create variations in the structure of the character. The resulting pattern is subjected to an affine transform and to dilation, using different transform matrices and structuring elements, respectively. Thus, the size, aspect ratio, orientation, position and thickness vary from instance to instance. A test comparing the distribution of the samples generated by this approach with that of natural samples shows close agreement between them. This validates the utility of the proposed character generation technique.
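
The generation steps summarized above (mean stroke parameters, random modulation, affine transform, dilation) can be illustrated with a minimal Python sketch. It is not the paper's implementation. The glyph in MEAN_LINES and MEAN_ARCS, the Gaussian and uniform noise models, the noise magnitudes, the 32x32 grid and every function name are hypothetical choices made only for illustration; in the approach described here the mean parameters would instead be derived from natural samples.

```python
import numpy as np

# Hypothetical mean parameters for a "D"-like glyph in unit coordinates:
# each line is (x0, y0, x1, y1), each arc is (cx, cy, r, theta0, theta1).
MEAN_LINES = np.array([[0.3, 0.2, 0.3, 0.8]])
MEAN_ARCS = np.array([[0.3, 0.5, 0.3, -np.pi / 2, np.pi / 2]])


def sample_strokes(lines, arcs, n=200):
    """Return an (N, 2) array of points sampled along all lines and arcs."""
    t = np.linspace(0.0, 1.0, n)
    pts = []
    for x0, y0, x1, y1 in lines:
        pts.append(np.column_stack([x0 + t * (x1 - x0), y0 + t * (y1 - y0)]))
    for cx, cy, r, a0, a1 in arcs:
        ang = a0 + t * (a1 - a0)
        pts.append(np.column_stack([cx + r * np.cos(ang), cy + r * np.sin(ang)]))
    return np.vstack(pts)


def random_affine(rng):
    """Draw a 2x2 matrix (rotation times anisotropic scale) and a translation."""
    angle = rng.normal(0.0, 0.1)            # assumed orientation jitter, radians
    sx, sy = rng.uniform(0.8, 1.1, size=2)  # assumed size / aspect-ratio jitter
    rot = np.array([[np.cos(angle), -np.sin(angle)],
                    [np.sin(angle), np.cos(angle)]])
    return rot @ np.diag([sx, sy]), rng.normal(0.0, 0.03, size=2)


def rasterize(points, size=32):
    """Mark the pixels hit by the sampled points; out-of-range points are dropped."""
    img = np.zeros((size, size), dtype=bool)
    idx = np.round(points * (size - 1)).astype(int)
    keep = np.all((idx >= 0) & (idx < size), axis=1)
    img[idx[keep, 1], idx[keep, 0]] = True
    return img


def dilate(img, selem):
    """Binary dilation for an odd-sized, symmetric structuring element."""
    h, w = img.shape
    kh, kw = selem.shape
    padded = np.pad(img, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros_like(img)
    for dy in range(kh):
        for dx in range(kw):
            if selem[dy, dx]:
                out |= padded[dy:dy + h, dx:dx + w]
    return out


def generate(rng, sigma=0.02, size=32):
    """Produce one artificial instance of the hypothetical glyph."""
    # 1. Randomly modulate the mean stroke parameters.
    lines = MEAN_LINES + rng.normal(0.0, sigma, MEAN_LINES.shape)
    arcs = MEAN_ARCS + rng.normal(0.0, sigma, MEAN_ARCS.shape)
    pts = sample_strokes(lines, arcs)
    # 2. Apply an affine transform about the centre of the unit square.
    matrix, shift = random_affine(rng)
    center = np.array([0.5, 0.5])
    pts = (pts - center) @ matrix.T + center + shift
    # 3. Rasterize, then dilate with a randomly sized square structuring element.
    k = int(rng.choice([1, 3]))
    return dilate(rasterize(pts, size), np.ones((k, k), dtype=bool))
```

Calling generate(np.random.default_rng(0)) returns one 32x32 boolean image, and repeated calls with the same generator give instances that differ in structure, size, aspect ratio, orientation, position and thickness. Comparing the distribution of such samples against natural ones, as the test reported above does, is outside the scope of this sketch.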