A rasterized building footprint dataset for the United States

Microsoft released a U.S.-wide vector building dataset in 2018. Although the vector building layers provide relatively accurate geometries, their use in large-extent geospatial analysis comes at a high computational cost. We used High-Performance Computing (HPC) to develop an algorithm that calculates six summary values for each cell in a raster representation of each U.S. state, excluding Alaska and Hawaii: (1) total footprint coverage, (2) number of unique buildings intersecting each cell, (3) number of building centroids falling inside each cell, and the (4) average, (5) smallest, and (6) largest area of the buildings that intersect each cell. These values are represented as raster layers with 30 m cell size covering the 48 conterminous states. We also identify errors in the original building dataset. We evaluate precision and recall in the data for three large U.S. urban areas. Precision is high and comparable to results reported by Microsoft, while recall is high for buildings with footprints larger than 200 m² but lower for progressively smaller buildings.

Measurement(s): building • building footprint • area • building count
Technology Type(s): computational modeling technique
Sample Characteristic - Environment: city
Sample Characteristic - Location: contiguous United States of America
Machine-accessible metadata file describing the reported data: https://doi.org/10.6084/m9.figshare.12444776
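The six per-cell summary values listed in the abstract can be illustrated with a minimal sketch. This is not the authors' HPC implementation. It assumes, purely for illustration, that each building is an axis-aligned rectangle given in metres, so that the overlap with a 30 m cell can be computed without a geometry library. Real footprints are arbitrary polygons and would need proper polygon clipping. The function name `summarize` and the dictionary keys are hypothetical.

```python
from collections import defaultdict
from math import ceil, floor

CELL = 30.0  # cell size in metres, matching the 30 m rasters in the abstract


def summarize(buildings, cell=CELL):
    """Compute six summary values per raster cell.

    ASSUMPTION: each building is an axis-aligned rectangle
    (xmin, ymin, xmax, ymax) in metres; real footprints are polygons.
    Returns {(col, row): {...six values...}} for cells touched by a building.
    """
    coverage = defaultdict(float)  # (1) total footprint area inside the cell
    areas = defaultdict(list)      # full areas of buildings intersecting the cell
    centroids = defaultdict(int)   # (3) building centroids inside the cell

    for xmin, ymin, xmax, ymax in buildings:
        area = (xmax - xmin) * (ymax - ymin)
        if area <= 0:
            continue  # skip degenerate rectangles
        cx, cy = (xmin + xmax) / 2, (ymin + ymax) / 2
        centroids[(floor(cx / cell), floor(cy / cell))] += 1

        # Visit every cell the rectangle's bounding range can overlap.
        for col in range(floor(xmin / cell), ceil(xmax / cell)):
            for row in range(floor(ymin / cell), ceil(ymax / cell)):
                w = min(xmax, (col + 1) * cell) - max(xmin, col * cell)
                h = min(ymax, (row + 1) * cell) - max(ymin, row * cell)
                if w > 0 and h > 0:
                    coverage[(col, row)] += w * h
                    areas[(col, row)].append(area)

    return {
        key: {
            "coverage": coverage[key],         # (1)
            "count": len(a),                   # (2) unique intersecting buildings
            "centroids": centroids.get(key, 0),  # (3)
            "avg_area": sum(a) / len(a),       # (4)
            "min_area": min(a),                # (5)
            "max_area": max(a),                # (6)
        }
        for key, a in areas.items()
    }


# Two hypothetical buildings: one inside cell (0, 0), one straddling
# the boundary between cells (0, 0) and (1, 0).
result = summarize([(0, 0, 10, 10), (25, 5, 35, 25)])
```

Note that values (4) to (6) use the full area of each intersecting building, not only the part inside the cell, whereas value (1) sums only the overlapping parts. A building that straddles a cell boundary is therefore counted in (2) for both cells but contributes its centroid to only one.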
