Automatic Target Detection in High-Resolution Remote Sensing Images Using Spatial Sparse Coding Bag-of-Words Model

Automatic detection of targets with complex shapes in high-resolution remote sensing images is a challenging task. In this letter, we propose a new detection framework based on the spatial sparse coding bag-of-words (SSCBOW) model to address this problem. Specifically, after a sliding window selects each processing unit and features are extracted, a new spatial mapping strategy encodes the geometric information; it represents the relative positions of the parts of a target and also handles rotation variations. Moreover, instead of the K-means visual-word encoding of the traditional bag-of-words (BOW) model, sparse coding is introduced to achieve a much lower reconstruction error. Finally, the SSCBOW representation is combined with a linear support vector machine for target detection. The experimental results demonstrate the precision and robustness of the SSCBOW-based detection method.
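To make the reconstruction-error argument concrete, the following minimal sketch compares hard assignment to the single nearest visual word (K-means style encoding) with a sparse code over the same codebook. It is an illustration under stated assumptions, not the letter's implementation: the codebook and descriptor are random unit vectors, the sizes `dim` and `n_words` are arbitrary, and greedy matching pursuit stands in for whichever sparse-coding solver the authors use. All function names are hypothetical.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    n = math.sqrt(dot(v, v))
    return [a / n for a in v]


def hard_assign_error(x, codebook):
    """Squared error when x is replaced by its single nearest codeword."""
    return min(sum((a - b) ** 2 for a, b in zip(x, d)) for d in codebook)


def matching_pursuit(x, codebook, n_atoms):
    """Greedy sparse code with n_atoms picks (an assumed stand-in solver).

    Returns the coefficient vector and the squared norm of the residual.
    The codewords are expected to have unit norm.
    """
    residual = list(x)
    coeffs = [0.0] * len(codebook)
    for _ in range(n_atoms):
        # Pick the codeword most correlated with the current residual.
        corr = [dot(residual, d) for d in codebook]
        j = max(range(len(codebook)), key=lambda i: abs(corr[i]))
        coeffs[j] += corr[j]
        # Remove the component explained by the chosen codeword.
        residual = [r - corr[j] * a for r, a in zip(residual, codebook[j])]
    return coeffs, dot(residual, residual)


# Synthetic data (assumption): random unit-norm codewords and one descriptor.
rng = random.Random(0)
dim, n_words = 16, 32
codebook = [normalize([rng.gauss(0, 1) for _ in range(dim)]) for _ in range(n_words)]
descriptor = normalize([rng.gauss(0, 1) for _ in range(dim)])

vq_error = hard_assign_error(descriptor, codebook)
codes, sc_error = matching_pursuit(descriptor, codebook, n_atoms=3)
print(f"hard assignment: {vq_error:.4f}, sparse code (3 picks): {sc_error:.4f}")
```

With unit-norm codewords, the first greedy pick already matches or beats hard assignment because the coefficient is free to scale, and each further pick cannot increase the residual. In a full pipeline, the coefficient vectors of all descriptors in a window would be pooled into one representation and passed to the linear classifier.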
