Paper information - Highly Degraded Recto-verso Document Image Processing and Understanding

can be interpreted as matching an input graph (the keyword) against a large set of graphs (the document). More formally, to spot a keyword w_i, all p graph instances g_i1, ..., g_ip of that word occurring in the training set are matched against all word graphs in each text line using our adapted graph matching procedure. That is, for a given word w_i and a specific text line s, the pairwise distances between all prototype graphs g_i1, ..., g_ip and the m word graphs g'_1, ..., g'_m of text line s are computed first. The minimum of these graph distances serves as the distance d(w_i, s) between the keyword’s word class w_i and the text line s. If d(w_i, s) is below a given threshold, the text line s and the word in s with the minimum distance are returned as a positive match for the keyword w_i.
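The spotting rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `spot_keyword`, the `graph_distance` callable, and the toy set-based "graphs" are hypothetical stand-ins, not the adapted graph matching procedure itself. Any real graph dissimilarity could be passed in instead of the toy one.

```python
from typing import Callable, Optional, Sequence, Tuple, TypeVar

G = TypeVar("G")  # placeholder for whatever graph representation is used


def spot_keyword(
    prototypes: Sequence[G],
    line_words: Sequence[G],
    graph_distance: Callable[[G, G], float],
    threshold: float,
) -> Optional[Tuple[float, int]]:
    """Return (d(w_i, s), index of the best word in s) if d(w_i, s) is
    below the threshold, otherwise None.

    d(w_i, s) is the minimum pairwise distance between the p prototype
    graphs of the keyword and the m word graphs of the text line.
    """
    best: Optional[Tuple[float, int]] = None
    for j, word in enumerate(line_words):
        for proto in prototypes:
            d = graph_distance(proto, word)
            if best is None or d < best[0]:
                best = (d, j)
    if best is not None and best[0] < threshold:
        return best
    return None


def toy_distance(a: frozenset, b: frozenset) -> float:
    """Assumption for the demo only: a 'graph' is a set of node labels and
    the distance is the size of the symmetric difference."""
    return float(len(a ^ b))


# Two prototype graphs of one keyword and a text line with three word graphs.
prototypes = [frozenset({"a", "b", "c"}), frozenset({"a", "b", "d"})]
line = [frozenset({"x", "y"}), frozenset({"a", "b", "d", "e"}), frozenset({"q"})]

match = spot_keyword(prototypes, line, toy_distance, threshold=2.0)
no_match = spot_keyword(prototypes, line, toy_distance, threshold=1.0)
```

With the looser threshold the second word of the line is returned together with its distance; with the stricter one the line is rejected, because the minimum distance is not strictly below the threshold.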

Anna Tonazzini | Emanuele Salerno

[1] R. Manmatha et al. Word spotting for historical documents. International Journal of Document Analysis and Recognition (IJDAR), 2006.

[2] Kaspar Riesen et al. Approximate graph edit distance computation by means of bipartite graph matching. Image Vis. Comput., 2009.