Automatic Document Logo Detection

Automatic logo detection and recognition continue to be of great interest to the document retrieval community because they enable effective identification of the source of a document. In this paper, we propose a new approach to logo detection and extraction in document images that robustly classifies and precisely localizes logos using a boosting strategy across multiple image scales. At a coarse scale, a trained Fisher classifier performs initial classification using features from document context and connected components. Each logo candidate region is then classified at successively finer scales by a cascade of simple classifiers, which allows false alarms to be discarded and the detected region to be refined. Our approach is segmentation-free and layout-independent. We define a meaningful evaluation metric to measure the quality of logo detection using labeled ground truth. We demonstrate the effectiveness of our approach using a large collection of real-world documents.
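The coarse-to-fine idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the two-dimensional feature vectors, the midpoint threshold, and the names `fit_fisher`, `fisher_score`, and `cascade_accepts` are all assumptions made for illustration. The sketch fits a two-class Fisher linear discriminant for the coarse stage, then passes a candidate through a sequence of per-scale tests and rejects it at the first failure.

```python
from typing import Callable, List, Sequence, Tuple

Vec = Tuple[float, float]  # hypothetical 2-D feature vector for one candidate region


def _mean(points: Sequence[Vec]) -> Vec:
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def _scatter(points: Sequence[Vec], m: Vec) -> Tuple[float, float, float]:
    """Entries (sxx, sxy, syy) of the symmetric 2x2 scatter matrix around m."""
    sxx = sum((p[0] - m[0]) ** 2 for p in points)
    sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
    syy = sum((p[1] - m[1]) ** 2 for p in points)
    return sxx, sxy, syy


def fit_fisher(pos: Sequence[Vec], neg: Sequence[Vec]) -> Tuple[Vec, float]:
    """Fisher discriminant: w = Sw^-1 (m_pos - m_neg), plus a decision threshold.

    The threshold is placed at the midpoint of the projected class means,
    which is an assumption of this sketch rather than a detail from the paper.
    """
    m1, m0 = _mean(pos), _mean(neg)
    a1, a0 = _scatter(pos, m1), _scatter(neg, m0)
    sxx, sxy, syy = a1[0] + a0[0], a1[1] + a0[1], a1[2] + a0[2]
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    dx, dy = m1[0] - m0[0], m1[1] - m0[1]
    # Closed-form inverse of the 2x2 within-class scatter matrix applied to (dx, dy).
    w = ((syy * dx - sxy * dy) / det, (-sxy * dx + sxx * dy) / det)
    threshold = 0.5 * (w[0] * (m1[0] + m0[0]) + w[1] * (m1[1] + m0[1]))
    return w, threshold


def fisher_score(w: Vec, x: Vec) -> float:
    """Project a feature vector onto the discriminant direction."""
    return w[0] * x[0] + w[1] * x[1]


def cascade_accepts(features_per_scale: Sequence[Vec],
                    stages: Sequence[Callable[[Vec], bool]]) -> bool:
    """Run one candidate through the stages, coarse to fine.

    features_per_scale[i] holds the (hypothetical) features computed for the
    candidate at scale i. The candidate is rejected at the first failing stage.
    """
    if len(features_per_scale) != len(stages):
        raise ValueError("need exactly one feature vector per stage")
    return all(stage(f) for stage, f in zip(stages, features_per_scale))


# Toy training data: logo-like candidates (pos) versus everything else (neg).
pos = [(4.0, 4.0), (5.0, 5.0), (4.0, 5.0), (5.0, 4.0)]
neg = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
w, t = fit_fisher(pos, neg)

stages: List[Callable[[Vec], bool]] = [
    lambda f: fisher_score(w, f) > t,  # coarse scale: Fisher classifier
    lambda f: f[0] > 0.5,              # finer scale: a simple threshold test
]
```

Ordering the stages so that the cheapest test runs first means most false alarms are discarded before any finer-scale features are needed, which is the usual motivation for a cascade.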
