Semantic Classification in Aerial Imagery by Integrating Appearance and Height Information

In this paper we present an efficient technique for accurate pixel-level semantic classification that integrates multiple modalities, such as color, edge responses, and height information. We propose a novel feature representation based on Sigma Points computations that makes it simple to apply powerful covariance descriptors within a multi-class randomized forest framework. Additionally, we incorporate semantic contextual knowledge using a conditional random field formulation. To allow a fair comparison with state-of-the-art methods, our approach is first evaluated on the MSRC image collection and then demonstrated on three challenging aerial image datasets: Dallas, Graz, and San Francisco. A full semantic classification of a single aerial image is obtained within two minutes. We also investigate the computation time on large-scale imagery comprising hundreds of images.
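To illustrate the Sigma Points idea mentioned above, the following is a minimal sketch and not the paper's implementation. It summarizes a patch of per-pixel feature vectors by its mean and covariance, then turns that pair into a fixed-length Euclidean vector that a randomized forest could split on directly. The function name `sigma_point_descriptor`, the default scaling `alpha = sqrt(d)`, and the regularization constant `eps` are assumptions chosen for illustration; the construction follows the standard symmetric sigma-point set from the unscented transform, and the paper's exact scaling and normalization may differ.

```python
import numpy as np


def sigma_point_descriptor(features, alpha=None, eps=1e-6):
    """Map a patch's feature statistics to a flat vector of sigma points.

    features: (n, d) array, one d-dimensional feature vector per pixel
              (e.g. color channels, edge responses, height).
    alpha:    scaling of the sigma points (assumed default: sqrt(d)).
    eps:      small diagonal regularizer so the Cholesky factor exists
              (an assumption, not taken from the paper).
    Returns a vector of length d * (2 * d + 1).
    """
    features = np.asarray(features, dtype=float)
    n, d = features.shape
    if alpha is None:
        alpha = np.sqrt(d)

    mu = features.mean(axis=0)
    # Biased sample covariance of the patch, regularized to be positive definite.
    cov = np.cov(features, rowvar=False, bias=True).reshape(d, d)
    cov = cov + eps * np.eye(d)

    # Cholesky factor: cov = L @ L.T
    L = np.linalg.cholesky(cov)

    # Row i of `offsets` is the i-th column of L, scaled by alpha.
    offsets = alpha * L.T
    points = np.vstack([mu, mu + offsets, mu - offsets])  # (2d + 1, d)
    return points.ravel()
```

With `alpha = sqrt(d)`, the 2d symmetric points have the same mean and covariance as the regularized patch statistics, so the flat vector carries the first- and second-order information of the covariance descriptor while living in an ordinary vector space.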
