Towards Robust Object Detection in Floor Plan Images: A Data Augmentation Approach

Object detection is one of the most critical tasks in computer vision. It comprises identifying and localizing objects in an image. Architectural floor plans represent the layout of buildings and apartments and consist of walls, windows, stairs, and furniture objects. While recognizing floor plan objects is straightforward for humans, processing floor plans and recognizing their objects automatically is a challenging problem, in part because floor plans and the objects they contain vary widely. In this work, we investigate the performance of the recently introduced Cascade Mask R-CNN network for object detection in floor plan images. Furthermore, we experimentally establish that deformable convolutions work better than conventional convolutions in the proposed framework. Training the network was hampered by the lack of publicly available data: currently available public datasets do not contain enough images to train deep neural networks efficiently. To address this issue, we introduce SFPI, a novel synthetic floor plan dataset consisting of 10,000 images. Our proposed method comfortably surpasses the previous state-of-the-art results on the SESYD dataset and sets strong baseline results on the proposed SFPI dataset. The dataset can be downloaded from SFPI Dataset Link. We believe that the novel dataset will enable researchers to further advance work in this domain.
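
The abstract contrasts deformable and conventional convolutions without stating the difference. As a brief sketch, the standard formulation from the Deformable Convolutional Networks literature is given below; the symbols (the sampling grid R, weights w, input feature map x, output y, and offsets Δp_n) are notation assumed here, not taken from this paper, and how the offsets are wired into the authors' backbone is not specified in the abstract.

```latex
% Conventional 3x3 convolution: sample x on a fixed grid R around location p_0.
\[
  \mathcal{R} = \{(-1,-1),\,(-1,0),\,\dots,\,(0,1),\,(1,1)\}, \qquad
  y(p_0) = \sum_{p_n \in \mathcal{R}} w(p_n)\, x(p_0 + p_n)
\]

% Deformable convolution: each grid point is shifted by a learned offset \Delta p_n.
\[
  y(p_0) = \sum_{p_n \in \mathcal{R}} w(p_n)\, x(p_0 + p_n + \Delta p_n)
\]
```

Because the offsets Δp_n are generally fractional, x is evaluated at the shifted locations by bilinear interpolation. Setting every Δp_n to zero recovers the conventional convolution, so the deformable layer can be read as a strict generalization whose sampling pattern adapts to the input.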
