A Comparison of 2D-3D Pose Estimation Methods

This thesis describes pose estimation as an increasingly used technique in augmentation and tracking, with many different solutions and methods that are constantly being optimized and that each have drawbacks and benefits. In real applications the aim is always speed, accuracy, or both. Pose estimation is used in many areas, but primarily in tracking and augmentation, where the related problem of finding 2D-2D correspondences is also a crucial research area today. Software such as ARToolKit [11] tracks a flat marker and can draw 3D objects on top of it for augmentation purposes. It is very fast, because accuracy is not the main concern when the eye has to judge whether something looks real or augmented; the speed, however, must be high enough for the augmented content to look as real as the background. There is no common standard for comparing pose estimation methods, and no standard method to compare against. In this thesis an effort is made to obtain a fair comparison, and a simple, well-known method is included as a baseline. In total, four methods are tested; each computes the perspective (the camera pose) from known 2D-3D correspondences between an image and a point cloud. All have different limitations, such as a minimum number of 2D-3D correspondence pairs or a sensitivity to noise that makes them unpredictable in noisy conditions. The benefits and drawbacks are listed for each method for easy comparison. Three of the methods, CPC, PosIt and PosIt for coplanar points, are nonlinear, while DLT is a linear method that is included because it is easy to implement and serves well as a baseline. All tests are done on synthetic data, which allows some extreme cases to be tested and provides ground truth for accurate comparisons. In short, the tests cover noise, an increased number of points, planarity issues, distance to the object, and initial guesses. The findings were many and show that the methods behave very differently. So when choosing a method, one has to consider the application and what data is available to the method.
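The abstract gives no implementation details, so the following is only a minimal sketch of the textbook DLT formulation, not the thesis's own code. It assumes NumPy is available; the function names `dlt_projection_matrix` and `project`, the camera intrinsics, the pose, and the random point cloud are all hypothetical values chosen for illustration. It builds noise-free synthetic 2D-3D correspondences from a known projection matrix, in the spirit of the ground-truth tests described above, and recovers the matrix up to scale.

```python
import numpy as np


def dlt_projection_matrix(points_3d, points_2d):
    """Estimate a 3x4 projection matrix (up to scale) with the basic DLT.

    Hypothetical helper: needs at least 6 non-coplanar 2D-3D pairs.
    """
    X = np.asarray(points_3d, dtype=float)
    x = np.asarray(points_2d, dtype=float)
    if X.shape[0] < 6 or X.shape[0] != x.shape[0]:
        raise ValueError("need at least 6 matching 2D-3D correspondences")

    rows = []
    for (xw, yw, zw), (u, v) in zip(X, x):
        xh = [xw, yw, zw, 1.0]
        # Each correspondence contributes two linear equations in the
        # 12 unknown entries of the projection matrix.
        rows.append(xh + [0.0] * 4 + [-u * c for c in xh])
        rows.append([0.0] * 4 + xh + [-v * c for c in xh])
    A = np.array(rows)

    # The least-squares solution of A p = 0 with ||p|| = 1 is the right
    # singular vector belonging to the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    P = vt[-1].reshape(3, 4)
    # Fix the arbitrary scale (the overall sign remains ambiguous).
    return P / np.linalg.norm(P[2, :3])


def project(P, points_3d):
    """Project 3D points to pixel coordinates with projection matrix P."""
    X = np.asarray(points_3d, dtype=float)
    Xh = np.hstack([X, np.ones((X.shape[0], 1))])
    xh = (P @ Xh.T).T
    return xh[:, :2] / xh[:, 2:3]


# Synthetic ground truth (all values are illustrative assumptions).
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
angle = 0.3  # rotation about the y axis, in radians
R = np.array([[np.cos(angle), 0.0, np.sin(angle)],
              [0.0, 1.0, 0.0],
              [-np.sin(angle), 0.0, np.cos(angle)]])
t = np.array([[0.1], [-0.2], [5.0]])
P_true = K @ np.hstack([R, t])

rng = np.random.default_rng(0)
pts_3d = rng.uniform(-1.0, 1.0, size=(10, 3))  # non-coplanar point cloud
pts_2d = project(P_true, pts_3d)               # noise-free image points

P_est = dlt_projection_matrix(pts_3d, pts_2d)
reproj_error = np.max(np.abs(project(P_est, pts_3d) - pts_2d))
print("max reprojection error (pixels):", reproj_error)
```

Adding Gaussian noise to `pts_2d` before calling `dlt_projection_matrix`, or placing all points on one plane, is one way to reproduce the kind of noise and planarity experiments listed in the abstract; this basic form of DLT is expected to degrade in both cases.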

[1] Larry S. Davis, et al. Iterative Pose Estimation Using Coplanar Feature Points, 1996, Comput. Vis. Image Underst.

[2] Vincent Lepetit, et al. Monocular Model-Based 3D Tracking of Rigid Objects: A Survey, 2005, Found. Trends Comput. Graph. Vis.

[3] Hirokazu Kato, et al. Marker tracking and HMD calibration for a video-based augmented reality conferencing system, 1999, Proceedings 2nd IEEE and ACM International Workshop on Augmented Reality (IWAR'99).

[4] O. Nelles, et al. An Introduction to Optimization, 1996, IEEE Antennas and Propagation Magazine.

[5] Helder Araújo, et al. A Fully Projective Formulation to Improve the Accuracy of Lowe's Pose-Estimation Algorithm, 1998, Comput. Vis. Image Underst.

[6] Larry S. Davis, et al. Model-based object pose in 25 lines of code, 1992, International Journal of Computer Vision.

[7] H. M. Karara, et al. Direct Linear Transformation from Comparator Coordinates into Object Space Coordinates in Close-Range Photogrammetry, 2015.

[8] Zhengyou Zhang, et al. Flexible camera calibration by viewing a plane from unknown orientations, 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[9] Vincent Lepetit, et al. Accurate Non-Iterative O(n) Solution to the PnP Problem, 2007, 2007 IEEE 11th International Conference on Computer Vision.

[10] Daniel Grest, et al. Marker free human motion capture in dynamic cluttered environments from a single view point, 2008.

[11] Ronald Azuma, et al. A Survey of Augmented Reality, 1997, Presence: Teleoperators & Virtual Environments.