Joint Estimation of Depth and Pose with IMU-assisted Photometric Loss

Estimating depth and camera pose from images is an important research problem. Photometric loss, which couples depth and pose in a single objective, is widely used for this task. However, photometric loss estimates depth and pose jointly from visual information alone, which makes the optimization difficult. Because an IMU (Inertial Measurement Unit) measures motion, which is closely related to pose, we introduce IMU measurements, referred to as Pose Hints in this paper, into the photometric loss for monocular depth and pose estimation. Pose Hints provide pose suggestions during optimization. To fuse the two heterogeneous data sources, we embed a nonlinear optimization module in Convolutional Neural Networks (CNNs) [15,16]. This module jointly optimizes depth and pose from the fused visual-inertial information by minimizing the photometric loss. During the nonlinear optimization, we derive a pseudo-IMU signal from consecutive pose estimates and compare it with the Pose Hints to measure the accuracy of the pose estimate; this measurement then guides the optimization. We show that using Pose Hints yields better estimation results than our baseline.
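
The idea of comparing a pseudo-IMU signal with Pose Hints can be illustrated with a minimal sketch. It is not the paper's implementation, and everything in it is an assumption made for illustration: the pseudo-IMU is reduced to a linear acceleration obtained by a second-order finite difference over three consecutive estimated positions, the IMU reading is taken to be gravity-compensated, bias-free, and expressed in the same frame, and the residual enters the objective as a weighted penalty with a hypothetical weight. The function names are likewise hypothetical.

```python
import math


def pseudo_acceleration(p_prev, p_curr, p_next, dt):
    """Pseudo-IMU acceleration from three consecutive estimated positions.

    Uses a second-order central finite difference. This is an assumed
    stand-in for however the pseudo-IMU is actually derived from poses.
    """
    return tuple(
        (a - 2.0 * b + c) / (dt * dt)
        for a, b, c in zip(p_prev, p_curr, p_next)
    )


def pose_hint_residual(pseudo, measured):
    """Euclidean distance between the pseudo-IMU and the measured IMU reading."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(pseudo, measured)))


def total_loss(photometric, residual, weight=0.1):
    """Photometric loss plus a weighted Pose Hint penalty.

    The additive form and the default weight are assumptions, not taken
    from the paper.
    """
    return photometric + weight * residual


# Positions sampled from x(t) = t^2, i.e. a constant acceleration of 2 along x.
dt = 0.1
positions = [(0.0, 0.0, 0.0), (0.01, 0.0, 0.0), (0.04, 0.0, 0.0)]
pseudo = pseudo_acceleration(*positions, dt)

imu_reading = (2.0, 0.0, 0.0)  # hypothetical gravity-compensated Pose Hint
residual = pose_hint_residual(pseudo, imu_reading)
loss = total_loss(photometric=0.5, residual=residual)
```

When the estimated trajectory is consistent with the IMU reading, the residual is close to zero and the objective reduces to the photometric term; an inconsistent pose estimate raises the residual and therefore the loss.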