Real-time embedded skew detection and frame removal

Document skew and frame artifacts are common when photocopying and scanning documents. The motivation of this work is to embed skew correction and frame removal in the copy pipeline of a device to achieve ‘one touch’ cleanup. This poses two challenges: (a) substantially reducing computation and memory requirements, and (b) minimizing false positives. Peripheral document features, such as page and content edges, are low-complexity skew predictors, whereas content-based approaches are relatively higher-complexity skew predictors. However, state-of-the-art page edge detection methods fail on low-contrast document images, or when the scanbed and document backgrounds are similar. To minimize false positives, as required in embedded implementations, we propose: (1) a robust page edge detection algorithm that is a multiplicative combination of gradient-based and line-based page edge detectors, (2) a robust skew detection algorithm that is a linear combination of page/content edge and content-based predictors, and (3) a pipeline for skew correction and frame removal that uses these algorithms and has near-100% accuracy over a wide range of document images.
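
As a rough illustration of the two combination rules named above, the following Python sketch shows one way a multiplicative fusion of two edge-evidence maps and a linear combination of skew estimates could look. It is not the authors' implementation: the function names, the assumption that both detector responses are normalized to [0, 1], and the normalization of the weights are all hypothetical choices made for this example.

```python
def fuse_edge_scores(gradient_score, line_score):
    """Multiplicative fusion of two per-pixel edge-evidence maps.

    Assumption (not from the source): both maps hold values in [0, 1].
    The product is high only where BOTH detectors respond, which is one
    way a multiplicative combination can suppress false positives.
    """
    return [
        [g * l for g, l in zip(g_row, l_row)]
        for g_row, l_row in zip(gradient_score, line_score)
    ]


def combine_skew_estimates(angles_deg, weights):
    """Weighted linear combination of skew-angle predictions (degrees).

    Assumption (not from the source): weights are non-negative and are
    normalized to sum to 1. Returns None when every weight is zero, so
    the caller can choose to leave the page uncorrected.
    """
    total = sum(weights)
    if total == 0:
        return None
    return sum(a * w for a, w in zip(angles_deg, weights)) / total


# Hypothetical 2x2 responses from a gradient detector and a line detector.
gradient = [[0.9, 0.8], [0.1, 0.0]]
lines = [[1.0, 0.0], [0.5, 0.7]]
fused = fuse_edge_scores(gradient, lines)

# Hypothetical predictions from an edge-based and a content-based predictor.
skew = combine_skew_estimates([1.0, 3.0], [3.0, 1.0])
```

In this sketch a pixel with a strong gradient but no line support (0.8 and 0.0) fuses to zero, and the combined skew leans toward the predictor given the larger weight.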