Individualness and Determinantal Point Processes for Pedestrian Detection

In this paper, we introduce individualness of detection candidates as a complement to objectness for pedestrian detection. The individualness assigns a single detection for each object out of raw detection candidates given by either object proposals or sliding windows. We show that conventional approaches, such as non-maximum suppression, are sub-optimal since they suppress nearby detections using only detection scores. We use a determinantal point process combined with the individualness to optimally select final detections. It models each detection using its quality and similarity to other detections based on the individualness. Then, detections with high detection scores and low correlations are selected by measuring their probability using a determinant of a matrix, which is composed of quality terms on the diagonal entries and similarities on the off-diagonal entries. For concreteness, we focus on the pedestrian detection problem as it is one of the most challenging problems due to frequent occlusions and unpredictable human motions. Experimental results demonstrate that the proposed algorithm works favorably against existing methods, including non-maximal suppression and a quadratic unconstrained binary optimization based method.

[1]  J. Schur Bemerkungen zur Theorie der beschränkten Bilinearformen mit unendlich vielen Veränderlichen. , 1911 .

[2]  Paul A. Viola,et al.  Rapid object detection using a boosted cascade of simple features , 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.

[3]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[4]  Jianguo Zhang,et al.  The PASCAL Visual Object Classes Challenge , 2006 .

[5]  Pascal Fua,et al.  Multicamera People Tracking with a Probabilistic Occupancy Map , 2008, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[6]  Stefan Roth,et al.  People-tracking-by-detection and people-detection-by-tracking , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[7]  P. Bickel,et al.  Efficient blind search: Optimal power of detection under computational cost constraints , 2007, 0712.1663.

[8]  J. Ferryman,et al.  PETS2009: Dataset and challenge , 2009, 2009 Twelfth IEEE International Workshop on Performance Evaluation of Tracking and Surveillance.

[9]  David A. McAllester,et al.  Object Detection with Discriminatively Trained Part Based Models , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[10]  Li Fei-Fei,et al.  Towards total scene understanding: Classification, annotation and segmentation in an automatic framework , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[11]  Malik Magdon-Ismail,et al.  On selecting a maximum volume sub-matrix of a matrix and related problems , 2009, Theor. Comput. Sci..

[12]  Ben Taskar,et al.  Structured Determinantal Point Processes , 2010, NIPS.

[13]  Leonidas J. Guibas,et al.  Human action recognition by learning bases of action attributes and parts , 2011, 2011 International Conference on Computer Vision.

[14]  Ben Taskar,et al.  Near-Optimal MAP Inference for Determinantal Point Processes , 2012, NIPS.

[15]  Ben Taskar,et al.  Determinantal Point Processes for Machine Learning , 2012, Found. Trends Mach. Learn..

[16]  Robert T. Collins,et al.  Optimized Pedestrian Detection for Multiple and Occluded People , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[17]  Trevor Darrell,et al.  Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[18]  Pietro Perona,et al.  Fast Feature Pyramids for Object Detection , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[19]  Matthieu Guillaumin,et al.  Non-maximum Suppression for Object Detection by Passing Messages Between Windows , 2014, ACCV.

[20]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[21]  Cewu Lu,et al.  Box Aggregation for Proposal Decimation: Last Mile of Object Detection , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[22]  Li Wan,et al.  End-to-end integration of a Convolutional Network, Deformable Parts Model and non-maximum suppression , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).