Paper Information - Efficient Multi-cue Scene Segmentation

Efficient Multi-cue Scene Segmentation

This paper presents a novel multi-cue framework for scene segmentation that combines appearance cues (grayscale images) with depth cues (dense stereo vision). An efficient 3D environment model is used to create a small set of meaningful free-form region hypotheses for object location and extent. These regions are subsequently classified into several object classes using an extended multi-cue bag-of-features pipeline. To this end, we augment grayscale bag-of-features with bag-of-depth-features operating on dense disparity maps, and add height pooling to incorporate a 3D geometric ordering into our region descriptor.
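The abstract does not spell out how the region descriptor is assembled, so the following is only a minimal sketch of one plausible reading of "bag-of-features plus height pooling". It assumes that local features have already been quantized to visual-word indices, that each feature carries a height above ground taken from the 3D model, and that pooling means keeping one histogram per height band. The function names, the band edges, and the vocabulary sizes are all hypothetical and are not taken from the paper.

```python
from bisect import bisect_right


def pooled_histogram(words, heights, vocab_size, band_edges):
    """Concatenate one visual-word histogram per height band (L1-normalized).

    words      -- visual-word index of each local feature in the region
    heights    -- height above ground of each feature (assumed to be given)
    band_edges -- sorted band boundaries; len(band_edges) + 1 bands result
    """
    n_bands = len(band_edges) + 1
    hist = [0.0] * (n_bands * vocab_size)
    for word, height in zip(words, heights):
        band = bisect_right(band_edges, height)  # which height band the feature falls in
        hist[band * vocab_size + word] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist


def multi_cue_descriptor(gray_words, depth_words, heights,
                         gray_vocab, depth_vocab, band_edges):
    """Concatenate the height-pooled grayscale and depth histograms."""
    gray = pooled_histogram(gray_words, heights, gray_vocab, band_edges)
    depth = pooled_histogram(depth_words, heights, depth_vocab, band_edges)
    return gray + depth


# Toy region with four features (all values are made up for illustration).
descriptor = multi_cue_descriptor(
    gray_words=[0, 2, 2, 1],
    depth_words=[1, 1, 0, 1],
    heights=[0.2, 0.5, 1.4, 2.5],
    gray_vocab=3,
    depth_vocab=2,
    band_edges=[1.0, 2.0],
)
print(len(descriptor))  # 3 bands * (3 + 2) words = 15
```

Keeping a separate histogram per height band is what lets the descriptor tell apart the same visual word seen near the ground and seen higher up; the resulting vector would then be passed to whatever region classifier is used.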
