Depth prediction from 2D images: A taxonomy and an evaluation study
Abstract Among the various cues that help us understand and interact with our surroundings, depth is of particular importance. It allows us to move through space and grasp objects to complete different tasks. Depth prediction has therefore been an active research field for decades, and many algorithms have been proposed to retrieve depth. Some imitate human vision and compute depth through triangulation on correspondences found between pixels or handcrafted features in different views of the same scene. Others rely on simple assumptions and semantic knowledge of the structure of the scene to obtain depth information. Recently, numerous algorithms based on deep learning have emerged from the computer vision community. They implement the same principles as the non-deep-learning methods and leverage the ability of deep neural networks to automatically learn features that are important for solving the task. In doing so, they produce new state-of-the-art results and show encouraging prospects. In this article, we propose a taxonomy of deep learning methods for depth prediction from 2D images, using the training strategy as the sorting criterion. Some methods are trained in a supervised manner, which means depth labels are needed during training, while others are trained in an unsupervised manner. In the latter case, the models learn to perform a different task, such as view synthesis, and depth is only a by-product of this learning. In addition to this taxonomy, we evaluate nine models on two similar datasets without retraining. Our analysis shows that (i) most models are sensitive to sharp discontinuities created by shadows or colour contrasts, and (ii) the post-processing applied to the results before computing the commonly used metrics can change the model ranking. Moreover, we show that most metrics agree with each other and are thus redundant.
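For reference, the sketch below illustrates the error and accuracy metrics commonly reported in this literature (absolute relative error, squared relative error, RMSE, log RMSE, and the δ-threshold accuracies), together with two typical post-processing steps, median scaling and depth clamping, of the kind the evaluation finds can change the model ranking. The depth range, masking, and scaling choices here are illustrative defaults, not the exact protocol used in the paper.

```python
import numpy as np

def compute_depth_metrics(pred, gt, min_depth=1e-3, max_depth=80.0, median_scale=True):
    """Standard monocular-depth error/accuracy metrics over valid pixels.

    `pred` and `gt` are depth maps of the same shape (e.g. in metres).
    Ground-truth pixels outside [min_depth, max_depth] are ignored.
    Note: illustrative defaults, not the paper's exact evaluation protocol.
    """
    mask = (gt > min_depth) & (gt < max_depth)
    pred, gt = pred[mask], gt[mask]

    # Typical post-processing: median scaling aligns a scale-ambiguous
    # prediction to the ground truth, then depths are clamped to the valid
    # range. Such choices are examples of the post-processing that can
    # alter the final ranking of models.
    if median_scale:
        pred = pred * np.median(gt) / np.median(pred)
    pred = np.clip(pred, min_depth, max_depth)

    thresh = np.maximum(pred / gt, gt / pred)
    return {
        "abs_rel":  np.mean(np.abs(pred - gt) / gt),
        "sq_rel":   np.mean((pred - gt) ** 2 / gt),
        "rmse":     np.sqrt(np.mean((pred - gt) ** 2)),
        "rmse_log": np.sqrt(np.mean((np.log(pred) - np.log(gt)) ** 2)),
        "delta_1":  np.mean(thresh < 1.25),
        "delta_2":  np.mean(thresh < 1.25 ** 2),
        "delta_3":  np.mean(thresh < 1.25 ** 3),
    }
```

The strong pairwise agreement between these metrics is what motivates the observation that several of them are redundant for ranking models.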