Tracking multiple construction workers through deep learning and the gradient based method with re-matching based on multi-object tracking accuracy

Abstract Tracking multiple construction workers is an active research area that is critical to job site planning. Its main challenges are missed detections and mismatches caused by occlusion and identity switches. To the best of the authors' knowledge, mismatches have not been reported in the construction literature on image-based, single-camera multiple worker tracking. Mismatches should therefore be accounted for through a representative performance index such as the Multi-Object Tracking Accuracy (MOTA). This work aims to improve the performance of current multiple worker tracking through an approach composed of three stages: detection, matching, and re-matching. In the detection stage, the deep learning detector Mask R-CNN is used. In the matching stage, workers are tracked between consecutive image frames through a gradient-based method with feature-based comparison, and the trajectories of the tracked objects are derived. Several cost measures and matching methods were evaluated for model selection, and the best-performing ones are recommended. Trajectories can be interrupted by missed detections or mismatches; we call these broken trajectories, which have no matched detections, orphans. In the re-matching stage, we attempt to match unmatched detections in the current frame to previous orphans based on extracted features. The proposed approach obtained a competitive MOTA of 56.7%, compared with 55.9% for the state-of-the-art Detect-And-Track model, on a human tracking benchmark dataset. On construction job sites, the approach was tested on four videos, resulting in a total MOTA of 81.8%, an average MOTA per video of 79.0% with a standard deviation of 13.0%, and maximum and minimum MOTAs of 96.0% and 69.0%, respectively. The proposed work could therefore provide better multiple worker tracking on construction job sites.
Additionally, to better represent tracking errors, this work recommends using MOTA to evaluate multiple construction worker tracking.
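
For readers unfamiliar with the metric, the following is a minimal sketch of how MOTA is computed under the CLEAR MOT definition: MOTA = 1 - (FN + FP + IDSW) / GT, where FN counts missed detections, FP counts false positives, IDSW counts mismatches (identity switches), and GT is the number of ground-truth objects summed over all frames. The class name, function names, and error counts below are hypothetical and are not taken from the paper. The sketch also assumes that a "total" MOTA pools the counts across videos, which is one way a total and a per-video average can differ.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class VideoCounts:
    """Error counts for one video (hypothetical container, not from the paper)."""
    gt: int    # ground-truth objects summed over all frames
    fn: int    # missed detections (false negatives)
    fp: int    # false positives
    idsw: int  # mismatches (identity switches)


def mota(c: VideoCounts) -> float:
    """CLEAR MOT accuracy: 1 - (FN + FP + IDSW) / GT."""
    if c.gt <= 0:
        raise ValueError("MOTA is undefined without ground-truth objects")
    return 1.0 - (c.fn + c.fp + c.idsw) / c.gt


def pooled_mota(videos: Sequence[VideoCounts]) -> float:
    """MOTA over the counts summed across all videos (assumed meaning of 'total')."""
    total = VideoCounts(
        gt=sum(v.gt for v in videos),
        fn=sum(v.fn for v in videos),
        fp=sum(v.fp for v in videos),
        idsw=sum(v.idsw for v in videos),
    )
    return mota(total)


def mean_mota(videos: Sequence[VideoCounts]) -> float:
    """Unweighted average of the per-video MOTA values."""
    return sum(mota(v) for v in videos) / len(videos)


# Hypothetical counts for two videos of very different length.
videos = [
    VideoCounts(gt=1000, fn=20, fp=10, idsw=10),
    VideoCounts(gt=200, fn=40, fp=15, idsw=5),
]

if __name__ == "__main__":
    for i, v in enumerate(videos, start=1):
        print(f"video {i}: MOTA = {mota(v):.3f}")
    print(f"pooled MOTA = {pooled_mota(videos):.3f}")
    print(f"mean MOTA   = {mean_mota(videos):.3f}")
```

Because the pooled value weights each video by its number of ground-truth objects, a long video with few errors pulls it above the unweighted per-video mean in this example. Note also that MOTA can be negative when the summed errors exceed the number of ground-truth objects.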
