Models for object detection, recognition, and shape alignment
The grand goal of computer vision is to provide a complete semantic interpretation of an input image by reasoning about the 3D scene that generated it. Object detection, recognition, and alignment are three fundamental vision tasks toward this goal. In this thesis, we develop a series of efficient algorithms to address these problems. The contributions are summarized as follows. (1) We present a two-step algorithm for detecting specific objects against cluttered backgrounds, given only a few example images and unknown camera poses. Instead of enforcing metric constraints on the local features, we use a set of ordering constraints that are powerful enough for the detection task. At the core of this algorithm is a qualitative feature matching scheme that combines an angular ordering constraint at the local scale with a graph planarity constraint at the global scale. (2) We present a part-based model for object categorization and part localization. The spatial interactions among parts are modeled by Factor Analysis, whose parameters can be learned from data. Constrained by the shape prior, part localization proceeds in the image space using a triangulated Markov random field (TMRF) model. We propose an iterative shape estimation and regularization approach for efficient computation. (3) We propose a boosting procedure for simultaneous multi-view car detection. By combining multi-class LogitBoost and AdaBoost detectors, we decompose the original problem into view classification and view-specific detection, which can be solved independently. We study various feature representations and weak learners for the boosting algorithms. Extensive experiments demonstrate improved accuracy and detection rate over the traditional algorithms. (4) We propose a Bayesian framework for robust shape alignment. Prior models assume Gaussian observation noise and attempt to fit a regularized shape using all the observed data; such an assumption is vulnerable to grossly erroneous local features and occlusions.
We address this problem with a hypothesis-and-test approach. A Bayesian inference algorithm is developed to generate a large number of shape hypotheses from randomly sampled partial shapes. The hypotheses are then evaluated in a robust estimation framework to find the optimal one. Our model can effectively handle outliers and recover the underlying object shape. The proposed approach is evaluated on a challenging dataset that spans a wide variety of car types, viewpoints, background scenes, and occlusion patterns.
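As a rough illustration of the hypothesis-and-test idea in contribution (4), the sketch below aligns a rigid template shape to observed landmarks. It is a simplified stand-in, not the Bayesian inference algorithm of the thesis: each hypothesis is a 2D similarity transform fitted to two randomly sampled landmarks (a minimal "partial shape"), and every hypothesis is scored with a truncated squared error so that outlying landmarks cannot dominate the fit. The function names, the complex-number point representation, and the parameters `n_hyp` and `tau` are all assumptions made for this example.

```python
import random

def fit_similarity(z0, z1, w0, w1):
    """Similarity transform w = a*z + t that maps two template points
    (z0, z1) exactly onto two observed points (w0, w1).
    Points are complex numbers x + 1j*y; z0 and z1 must differ."""
    a = (w1 - w0) / (z1 - z0)   # rotation and scale
    t = w0 - a * z0             # translation
    return a, t

def robust_cost(template, observed, a, t, tau=1.0):
    """Truncated squared error: each landmark contributes at most tau**2,
    so a grossly wrong observation has bounded influence."""
    return sum(min(abs(a * z + t - w) ** 2, tau ** 2)
               for z, w in zip(template, observed))

def hypothesize_and_test(template, observed, n_hyp=200, tau=1.0, seed=0):
    """Generate n_hyp hypotheses from random landmark pairs and keep the
    one with the lowest robust cost. Returns (cost, a, t)."""
    rng = random.Random(seed)
    best = (float("inf"), None, None)
    for _ in range(n_hyp):
        # Sample a partial shape: two distinct landmark indices.
        i, j = rng.sample(range(len(template)), 2)
        a, t = fit_similarity(template[i], template[j],
                              observed[i], observed[j])
        cost = robust_cost(template, observed, a, t, tau)
        if cost < best[0]:
            best = (cost, a, t)
    return best
```

A least-squares fit over all landmarks would be pulled toward any corrupted points. Here a hypothesis built only from uncorrupted landmarks explains the remaining inliers well and pays at most `tau**2` for each outlier, which is why it tends to win the test step.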