Document noise removal using sparse representations over learned dictionary

In this paper, we propose an algorithm for denoising document images using sparse representations. Following a training set, this algorithm is able to learn the main document characteristics and also, the kind of noise included into the documents. In this perspective, we propose to model the noise energy based on the normalized cross-correlation between pairs of noisy and non-noisy documents. Experimental results on several datasets demonstrate the robustness of our method compared with the state-of-the-art.

[1]  Michael Elad,et al.  Optimally sparse representation in general (nonorthogonal) dictionaries via ℓ1 minimization , 2003, Proceedings of the National Academy of Sciences of the United States of America.

[2]  M. Elad,et al.  $rm K$-SVD: An Algorithm for Designing Overcomplete Dictionaries for Sparse Representation , 2006, IEEE Transactions on Signal Processing.

[3]  Emmanuel J. Candès,et al.  The curvelet transform for image denoising , 2002, IEEE Trans. Image Process..

[4]  Elisa H. Barney Smith Modeling image degradations for improving OCR , 2008, 2008 16th European Signal Processing Conference.

[5]  Robert M. Haralick,et al.  A Statistical, Nonparametric Methodology for Document Degradation Model Validation , 2000, IEEE Trans. Pattern Anal. Mach. Intell..

[6]  Michael Elad,et al.  Sparse and Redundant Representations - From Theory to Applications in Signal and Image Processing , 2010 .

[7]  Robert M. Haralick,et al.  Global and local document degradation models , 1993, Proceedings of 2nd International Conference on Document Analysis and Recognition (ICDAR '93).

[8]  Eero P. Simoncelli,et al.  Image quality assessment: from error visibility to structural similarity , 2004, IEEE Transactions on Image Processing.

[9]  I. Daubechies,et al.  Iteratively reweighted least squares minimization for sparse recovery , 2008, 0807.0575.

[10]  J. P. Lewis Fast Normalized Cross-Correlation , 2010 .

[11]  Y. C. Pati,et al.  Orthogonal matching pursuit: recursive function approximation with applications to wavelet decomposition , 1993, Proceedings of 27th Asilomar Conference on Signals, Systems and Computers.

[12]  Kjersti Engan,et al.  Frame based signal compression using method of optimal directions (MOD) , 1999, ISCAS'99. Proceedings of the 1999 IEEE International Symposium on Circuits and Systems VLSI (Cat. No.99CH36349).

[13]  Bhaskar D. Rao,et al.  Sparse signal reconstruction from limited data using FOCUSS: a re-weighted minimum norm algorithm , 1997, IEEE Trans. Signal Process..

[14]  Vladimir N. Temlyakov,et al.  Weak greedy algorithms[*]This research was supported by National Science Foundation Grant DMS 9970326 and by ONR Grant N00014‐96‐1‐1003. , 2000, Adv. Comput. Math..

[15]  C. Tappert,et al.  A Survey of Binary Similarity and Distance Measures , 2010 .

[16]  Salvatore Tabbone,et al.  Author manuscript, published in "IEEE International Conference on Image Processing- ICIP'2011 (2011)" EDGE NOISE REMOVAL IN BILEVEL GRAPHICAL DOCUMENT IMAGES USING SPARSE REPRESENTATION , 2011 .

[17]  Michael A. Saunders,et al.  Atomic Decomposition by Basis Pursuit , 1998, SIAM J. Sci. Comput..

[18]  A. Bruckstein,et al.  K-SVD : An Algorithm for Designing of Overcomplete Dictionaries for Sparse Representation , 2005 .

[19]  Stéphane Mallat,et al.  Matching pursuits with time-frequency dictionaries , 1993, IEEE Trans. Signal Process..

[20]  E. R. Davies,et al.  Machine vision - theory, algorithms, practicalities , 2004 .

[21]  Petros Maragos,et al.  Morphological filters-Part II: Their relations to median, order-statistic, and stack filters , 1987, IEEE Trans. Acoust. Speech Signal Process..