An Evaluation of Different Methods for 3D-Driver-Body-Pose Estimation

Driver monitoring systems are increasingly being introduced in modern commercial vehicles. Their importance will grow with automated vehicles, which require the driver to remain attentive or to take over in a timely manner. With the success of deep learning methods for human body pose estimation, such methods are increasingly employed in research projects on driver monitoring. However, their accuracy for driver body pose estimation has not yet been evaluated thoroughly. We therefore annotate a part of the Drive&Act dataset [1] and evaluate both 2D and 3D body pose estimation performance based on triangulation and depth images. To this end, we also introduce a deep-learning-based post-processing step for depth-image-based 3D pose estimation that can be applied at little additional cost to the output of any 2D pose detector, lifting the pose prediction to 3D. Our evaluation gives an overview of the performance of current state-of-the-art methods and shows that our depth post-processing method can close the gap to triangulation-based methods that use complex camera setups.
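
To illustrate what lifting a 2D pose to 3D with a depth image involves at its simplest, the following minimal sketch back-projects detected 2D keypoints through a pinhole camera model. It is not the learned post-processing step introduced in this work, only the plain geometric baseline that such a step could refine. The function name, the intrinsics fx, fy, cx, cy, and the convention that a non-positive depth value marks a missing measurement are assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple

Point3D = Tuple[float, float, float]


def lift_keypoints_to_3d(
    keypoints_2d: Sequence[Tuple[float, float]],
    depth_map: Sequence[Sequence[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Optional[Point3D]]:
    """Back-project 2D keypoints (u, v) in pixels to camera-frame 3D points.

    depth_map[row][col] holds the depth along the optical axis, in the same
    unit as the returned coordinates. Keypoints that fall outside the image
    or onto a non-positive (assumed missing) depth value yield None.
    """
    height = len(depth_map)
    width = len(depth_map[0]) if height else 0
    points: List[Optional[Point3D]] = []
    for u, v in keypoints_2d:
        # Sample the depth at the nearest pixel to the detected keypoint.
        col, row = int(round(u)), int(round(v))
        if not (0 <= row < height and 0 <= col < width):
            points.append(None)
            continue
        z = depth_map[row][col]
        if z <= 0:
            points.append(None)
            continue
        # Pinhole model: u = fx * x / z + cx and v = fy * y / z + cy.
        x = (u - cx) * z / fx
        y = (v - cy) * z / fy
        points.append((x, y, z))
    return points
```

A direct lookup of this kind fails whenever a joint is occluded or the depth sensor returns no value at the keypoint location, which is one reason a more robust post-processing step is of interest.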