Merging Scored Bounding Boxes with Gaussian Mixture Model for Object Detection

Object detection has long struggled with the problem of accurately localizing a target from a large number of scored detections. One of the most widely used methods is non-maximum suppression (NMS). However, because NMS only selects locally high-scoring detections, its results are sometimes inaccurate. In this paper, we propose a novel approach that merges all the scored bounding boxes with a Gaussian Mixture Model (GMM), taking into account not only the spatial information but also the score of each detection. We report experiments on both pedestrian detection and face detection with publicly available datasets. The drawbacks of NMS can be overcome to some extent, and the proposed method outperforms other conventional methods.
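To make the contrast with NMS concrete, the following is a minimal sketch and not the implementation evaluated in the paper. It assumes boxes are given as (x1, y1, x2, y2) rows, fits a diagonal-covariance GMM by EM in which each box is weighted by its score, and initializes one component per NMS survivor. The function names, the IoU threshold, the initial variance, and the variance floor are all hypothetical choices made for illustration.

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, all as (x1, y1, x2, y2)."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def nms(boxes, scores, thresh=0.5):
    """Greedy NMS: keep the top-scoring box, drop neighbours that overlap it."""
    order = np.argsort(-scores)
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        rest = order[1:]
        order = rest[iou(boxes[i], boxes[rest]) <= thresh]
    return np.array(keep)

def merge_with_gmm(boxes, scores, thresh=0.5, n_iter=20, var_floor=1.0):
    """Score-weighted EM for a diagonal GMM over box coordinates.

    Assumption of this sketch: one component per NMS survivor, initialized
    at that survivor. Returns the component means (merged boxes) and the
    mixing weights.
    """
    means = boxes[nms(boxes, scores, thresh)].astype(float)
    k = len(means)
    var = np.full((k, 4), 100.0)   # hypothetical initial variance (pixels^2)
    pi = np.full(k, 1.0 / k)
    w = scores / scores.sum()      # each box counts in proportion to its score
    for _ in range(n_iter):
        # E-step: responsibility of each component for each box
        diff = boxes[:, None, :] - means[None, :, :]
        log_p = -0.5 * np.sum(diff**2 / var + np.log(2 * np.pi * var), axis=2)
        log_p += np.log(pi)
        log_p -= log_p.max(axis=1, keepdims=True)
        r = np.exp(log_p)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: score-weighted updates of means, variances, mixing weights
        rw = r * w[:, None]
        nk = rw.sum(axis=0) + 1e-12
        means = (rw.T @ boxes) / nk[:, None]
        diff = boxes[:, None, :] - means[None, :, :]
        var = np.einsum("nk,nkd->kd", rw, diff**2) / nk[:, None] + var_floor
        pi = nk / nk.sum()
    return means, pi

# Two well-separated groups of detections (made-up numbers).
boxes = np.array([[10, 10, 50, 50], [12, 12, 52, 52], [8, 8, 48, 48],
                  [200, 200, 260, 260], [204, 204, 264, 264]], dtype=float)
scores = np.array([0.9, 0.6, 0.3, 0.8, 0.8])
merged, weights = merge_with_gmm(boxes, scores)
```

In this sketch NMS would return the single highest-scoring box of each group unchanged, whereas each merged box is a score-weighted average of all boxes assigned to its component, so lower-scoring detections still influence the final position.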
