Paper information - Detecting, Classifying, and Mapping Retail Storefronts Using Street-level Imagery

Up-to-date listings of retail stores and related building functions are challenging and costly to maintain. We introduce a novel method for automatically detecting, geo-locating, and classifying retail stores and related commercial functions, based on storefronts extracted from street-level imagery. Specifically, we present a deep learning approach that takes storefronts from street-level imagery as input and directly outputs the geo-location and type of commercial function. On a real-world dataset of street-level images, our method achieved a recall of 89.05% and a precision of 88.22%. These experiments indicate that the approach reaches human-level accuracy while running markedly more efficiently than methods such as Faster Region-based Convolutional Neural Networks (Faster R-CNN) and the Single Shot Detector (SSD).
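For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how precision and recall are conventionally computed from detection counts. It is not the authors' evaluation code: the function names and the example counts are hypothetical, and the F1 value is simply derived from the precision and recall quoted in the abstract, not a figure reported there.

```python
def precision(tp: int, fp: int) -> float:
    """Share of predicted storefronts that are correct: TP / (TP + FP)."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Share of true storefronts that were found: TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Hypothetical counts, for illustration only (not taken from the paper).
tp, fp, fn = 90, 10, 10
p = precision(tp, fp)
r = recall(tp, fn)
print(f"precision={p:.4f} recall={r:.4f} f1={f1_score(p, r):.4f}")

# F1 implied by the precision and recall quoted in the abstract.
reported_f1 = f1_score(0.8822, 0.8905)
print(f"F1 implied by the reported figures: {reported_f1:.4f}")
```

Because F1 is a harmonic mean, the implied value necessarily lies between the reported precision and recall.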

Alessandro Bozzon | Sihang Qiu | Geert-Jan Houben | Achilleas Psyllidis | Shahin Sharifi Noorian
