Efficient scale space auto-context for image segmentation and labeling

The conditional random field (CRF) model, which combines patch-based classification with context information, has been widely adopted for image segmentation and labeling. In this paper, we propose three components that improve speed and accuracy, and illustrate them on the auto-context algorithm: (1) a new coding scheme for multiclass classification, named data-assisted output code (DAOC); (2) a scale-space approach that makes the algorithm less sensitive to changes in geometric scale; and (3) a region-based voting scheme that makes it faster and more accurate at object boundaries. The proposed multiclass classifier, DAOC, is general and particularly appealing when the number of classes becomes large, since it needs only ⌈log2 k⌉ binary classifiers for k classes, the minimum possible. We show the advantages of the DAOC classifier over existing algorithms on several datasets from the UC Irvine repository, as well as on vision applications. Combining DAOC, the scale-space approach, and the region-based voting scheme for auto-context, the overall algorithm is significantly faster (5 to 10 times) than the original auto-context, with improved accuracy over many existing algorithms on the MSRC and VOC 2007 datasets.
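
To make the ⌈log2 k⌉ count concrete, the sketch below assigns each of k classes a plain binary codeword, so that each bit position corresponds to one binary classifier. This is only an illustration of the code length: the codewords here are the binary digits of the class index, whereas DAOC chooses its code with the help of the data, which is not reproduced here. The function names and the value k = 21 are assumptions made for the example.

```python
def code_length(k: int) -> int:
    """Smallest number of bits that can distinguish k classes, i.e. ceil(log2 k)."""
    if k < 2:
        raise ValueError("need at least two classes")
    # (k - 1).bit_length() equals ceil(log2 k) for k >= 2, without floating point.
    return (k - 1).bit_length()


def codeword(label: int, k: int) -> list[int]:
    """Binary codeword for a class label; bit i is the target of binary classifier i.

    Assumption: labels are simply written in binary (least significant bit first).
    A data-assisted scheme would pick the class-to-codeword mapping differently.
    """
    if not 0 <= label < k:
        raise ValueError("label out of range")
    return [(label >> bit) & 1 for bit in range(code_length(k))]


def decode(bits: list[int]) -> int:
    """Map the outputs of the binary classifiers back to a class index.

    With a minimal-length code there is no redundancy, so a bit pattern may
    decode to an index >= k; the caller has to handle that case.
    """
    return sum(b << i for i, b in enumerate(bits))


if __name__ == "__main__":
    k = 21  # illustrative number of classes
    print(code_length(k))   # binary classifiers needed with a minimal code
    print(codeword(5, k))   # per-classifier targets for class 5
    print(k)                # classifiers a one-vs-all scheme would train
```

With k = 21, a minimal code needs 5 binary classifiers where one-vs-all trains 21, which is where the saving for large k comes from. The trade-off is that a minimal code has no error-correcting redundancy, so a single wrong bit changes the decoded class.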
