Detection of construction workers under varying poses and changing background in image sequences via very deep residual networks

Abstract

Analyzing the location and behavior of construction workers in site images is recognized as a source of useful information for safety management and productivity analysis. Using such image data effectively requires accurate and timely detection of workers in complex, continuously changing working environments, yet existing worker detection methods still perform too poorly for this purpose. This study proposes the use of very deep residual networks to accurately and rapidly detect construction workers under varying poses and against changing backgrounds in image sequences. The detection architecture is based on convolutional neural networks (CNNs) and has two stages: extracting feature maps via a very deep residual network (ResNet-152), followed by bounding box regression and labeling on the original image via Faster regions with CNN features (Faster R-CNN). Experiments were conducted at actual construction sites, where 1.3-megapixel and 3.1-megapixel images were acquired with a movable digital camera to verify the proposed method on images from both fixed and moving cameras. Faster R-CNN with ResNet-152 achieved accuracy, precision, and recall of 94.3%, 96.03%, and 98.13%, respectively, on 3241 images. The proposed method took 0.2 s per frame on average (i.e., 5 frames per second). The results show that very deep residual networks can accurately and rapidly detect multiple workers in construction site images without relying on restrictive assumptions about workers' postures, appearance, or background.
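The reported figures can be related to one another with a short calculation. The sketch below is illustrative only: the abstract does not state how accuracy was defined, so the code assumes the common detection convention accuracy = TP / (TP + FP + FN), with no true negatives counted. The function names are hypothetical and not taken from the study. Under that assumption, accuracy follows from precision and recall as 1 / (1/P + 1/R - 1), and the 0.2 s per frame figure converts directly to frames per second.

```python
def detection_metrics(tp, fp, fn):
    """Return (accuracy, precision, recall) from detection counts.

    Assumption: accuracy is TP / (TP + FP + FN), i.e. true negatives
    are not counted. The abstract does not confirm this definition.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = tp / (tp + fp + fn)
    return accuracy, precision, recall


def accuracy_from_pr(precision, recall):
    """Accuracy implied by precision and recall under the same assumption.

    Derivation: (TP+FP+FN)/TP = (TP+FP)/TP + (TP+FN)/TP - 1 = 1/P + 1/R - 1.
    """
    return 1.0 / (1.0 / precision + 1.0 / recall - 1.0)


def frames_per_second(seconds_per_frame):
    """Convert average per-frame processing time to throughput."""
    return 1.0 / seconds_per_frame


if __name__ == "__main__":
    # Precision and recall as reported in the abstract.
    implied = accuracy_from_pr(0.9603, 0.9813)
    print(f"implied accuracy: {implied:.4f}")  # approximately 0.943
    print(f"throughput: {frames_per_second(0.2):.1f} fps")
```

With the reported precision (96.03%) and recall (98.13%), the implied accuracy is about 94.3%, which matches the reported value. This suggests, but does not prove, that the study used this definition.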
