Detecting, Classifying, and Mapping Retail Storefronts Using Street-level Imagery

Up-to-date listings of retail stores and related building functions are difficult and costly to maintain. We introduce a novel method for automatically detecting, geo-locating, and classifying retail stores and related commercial functions based on storefronts extracted from street-level imagery. Specifically, we present a deep learning approach that takes street-level images of storefronts as input and directly outputs the geo-location and the type of commercial function. On a real-world dataset of street-level images, our method achieves a recall of 89.05% and a precision of 88.22%. These experiments indicate that the approach reaches human-level accuracy while being markedly more efficient at run time than methods such as Faster Region-based Convolutional Neural Networks (Faster R-CNN) and Single Shot Detector (SSD).
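For readers less familiar with the two reported metrics, the following minimal sketch shows how precision and recall are conventionally computed from detection counts, and how the two reported figures combine into an F1 score. The function names and the example counts are hypothetical and serve only as illustration; the F1 value is derived here from the reported precision and recall and is not a figure stated by the authors.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Standard definitions: tp = correct detections, fp = spurious
    detections, fn = storefronts that were missed."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only (not taken from the paper).
p, r = precision_recall(tp=8, fp=2, fn=2)

# F1 implied by the reported precision (88.22%) and recall (89.05%).
reported_f1 = f1_score(0.8822, 0.8905)
print(f"precision={p:.2f} recall={r:.2f} implied F1={reported_f1:.4f}")
```

Because precision and recall are nearly equal here, the implied F1 score lies between them, at roughly 88.6%.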
