Exact Acceleration of Linear Object Detectors

We describe a general and exact method to considerably speed up linear object detection systems that operate in a sliding, multi-scale window fashion, such as the individual part detectors of part-based models. The main bottleneck of many such systems is the computational cost of the convolutions between the linear filters and the multiple rescalings of the image to process. We exploit properties of the Fourier transform, together with careful implementation strategies, to obtain a speedup factor proportional to the filters' sizes. The gain in performance is demonstrated on the well-known Pascal VOC benchmark, where we speed up these convolutions by an order of magnitude.
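The Fourier-transform property behind this kind of speedup can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes NumPy, a single feature plane, and one filter, and the function names `correlate_direct` and `correlate_fft` are hypothetical. It only shows that the sliding-window response can be computed exactly as a pointwise product in the frequency domain, so that the per-position cost no longer depends on the filter size.

```python
import numpy as np


def correlate_direct(image, filt):
    """Sliding-window response: dot product of the filter with every
    fully contained window of the image (cost grows with filter area)."""
    H, W = image.shape
    h, w = filt.shape
    out = np.empty((H - h + 1, W - w + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(image[y:y + h, x:x + w] * filt)
    return out


def correlate_fft(image, filt):
    """Same response via the correlation theorem: transform both inputs,
    multiply pointwise (with the filter conjugated), transform back."""
    H, W = image.shape
    h, w = filt.shape
    F_image = np.fft.rfft2(image)
    # Zero-pad the filter to the image size before transforming it.
    F_filt = np.fft.rfft2(filt, s=(H, W))
    full = np.fft.irfft2(F_image * np.conj(F_filt), s=(H, W))
    # Only positions where the filter fits entirely inside the image
    # are free of circular wrap-around.
    return full[:H - h + 1, :W - w + 1]


rng = np.random.default_rng(0)
image = rng.standard_normal((32, 41))   # stand-in for one feature plane
filt = rng.standard_normal((5, 7))      # stand-in for one linear filter
print(np.allclose(correlate_direct(image, filt), correlate_fft(image, filt)))
```

In a detector with many filters, the transform of each image scale can be computed once and reused for every filter, and the filter transforms can be precomputed, which is where the savings accumulate.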
