Multiscale Document Page Segmentation Using Soft Decision Integration

A new algorithm for layout independent document image segmentation is suggested. Text, image and graphics regions in a document image are treated as three different "texture" classes. Feature vectors based on multi-scale wavelet packet representation are used for local classification. Segmentation is performed by propagating soft local decisions made on small windows across neighboring blocks and integrating them to reduce their "ambiguities" and increase their "confidence" as more contextual evidence is obtained from the image data. Local votes propagate in a neighborhood, within and across scales, and majorities of weighted votes give the final decisions. The method has been tested on document page decomposition tasks, and the results of these tests are presented. The algorithm is general, can be applied to other segmentation and classification tasks, is based on parallel, distributed and independent computations and has low complexity.
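
The vote propagation described above can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes the soft local decisions are already available as per-block class probabilities, works at a single scale only, and uses a 3x3 neighborhood with a hypothetical `self_weight` parameter. The function names `integrate_soft_decisions` and `final_labels` are illustrative choices.

```python
import numpy as np


def integrate_soft_decisions(probs, n_iter=3, self_weight=2.0):
    """Propagate soft votes between neighboring blocks (single-scale sketch).

    probs: array of shape (H, W, C) holding, for each block, a soft vote
           over C classes (e.g. text, image, graphics). Assumed non-negative
           with a positive sum per block.
    self_weight: hypothetical weight of a block's own vote relative to
           each of its 8 neighbors (an assumption, not from the paper).
    """
    p = np.asarray(probs, dtype=float)
    h, w = p.shape[:2]
    for _ in range(n_iter):
        # Replicate border blocks so every block has 8 neighbors.
        padded = np.pad(p, ((1, 1), (1, 1), (0, 0)), mode="edge")
        acc = self_weight * p
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                if dy == 0 and dx == 0:
                    continue
                # Add the vote of the neighbor at offset (dy, dx).
                acc = acc + padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
        # Renormalize so each block's votes again sum to 1.
        p = acc / acc.sum(axis=2, keepdims=True)
    return p


def final_labels(p):
    """Final hard decision per block: the class with the largest weighted vote."""
    return p.argmax(axis=2)
```

In this sketch an isolated block whose local vote disagrees with all of its neighbors is outvoted after one pass, which is the intended effect of integrating contextual evidence. Extending it across scales would require combining vote maps computed at several window sizes, which is not shown here.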