Constant-time monocular object detection using scene geometry

This paper presents a structured approach for efficiently exploiting the perspective information of a scene to improve object detection in monocular systems. It defines a finite grid of 3D positions on the dominant ground plane and computes occupancy maps from which object location estimates are extracted. The method works on top of any detection technique, whether pixel-wise (e.g. background subtraction) or region-wise (e.g. detection-by-classification), any of which can be linked to the proposed scheme with minimal fine-tuning. This flexibility allows the approach to be applied in a wide variety of applications and sectors, such as surveillance (e.g. person detection) or driver assistance systems (e.g. vehicle or pedestrian detection). Extensive results provide evidence of its excellent performance and its ease of use in combination with different image processing techniques.
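The grid-and-occupancy idea can be illustrated with a minimal sketch. It is not the paper's implementation: the function names (ground_grid, cell_box, occupancy_map), the camera model (a level pinhole camera at a known height), the object size, and the use of an integral image to score each cell in constant time are all assumptions made for illustration. The pixel-wise detector is stood in for by a synthetic binary foreground mask.

```python
import numpy as np

def ground_grid(x_min, x_max, z_min, z_max, step):
    """Finite grid of 3D positions on the ground plane Y = 0, shape (N, 3)."""
    xs = np.arange(x_min, x_max + step / 2, step)
    zs = np.arange(z_min, z_max + step / 2, step)
    gx, gz = np.meshgrid(xs, zs)
    return np.stack([gx.ravel(), np.zeros(gx.size), gz.ravel()], axis=1)

def cell_box(P, pos, width, height, img_shape):
    """Image box (u0, v0, u1, v1) of an upright object standing at pos."""
    x, _, z = pos
    # Four corners of a camera-facing rectangle (hypothetical object model).
    corners = np.array([[x + dx, y, z, 1.0]
                        for dx in (-width / 2, width / 2)
                        for y in (0.0, height)])
    uvw = corners @ P.T
    uv = uvw[:, :2] / uvw[:, 2:3]
    h, w = img_shape
    u0, u1 = np.clip(np.round([uv[:, 0].min(), uv[:, 0].max()]), 0, w).astype(int)
    v0, v1 = np.clip(np.round([uv[:, 1].min(), uv[:, 1].max()]), 0, h).astype(int)
    return u0, v0, u1, v1

def occupancy_map(P, grid, mask, width, height):
    """Fraction of foreground pixels inside each grid cell's projected box."""
    # Integral image: ii[v, u] holds the sum of mask[:v, :u].
    ii = np.pad(mask.astype(np.float64).cumsum(axis=0).cumsum(axis=1),
                ((1, 0), (1, 0)))
    occ = np.zeros(len(grid))
    for i, pos in enumerate(grid):
        u0, v0, u1, v1 = cell_box(P, pos, width, height, mask.shape)
        area = (u1 - u0) * (v1 - v0)
        if area > 0:
            fg = ii[v1, u1] - ii[v0, u1] - ii[v1, u0] + ii[v0, u0]
            occ[i] = fg / area
    return occ

# Assumed camera: focal length 500 px, principal point (320, 240),
# mounted 2 m above the ground, looking along +Z with no tilt.
f, cx, cy, cam_h = 500.0, 320.0, 240.0, 2.0
K = np.array([[f, 0.0, cx], [0.0, f, cy], [0.0, 0.0, 1.0]])
Rt = np.array([[1.0, 0.0, 0.0, 0.0],
               [0.0, -1.0, 0.0, cam_h],   # world Y points up, image v points down
               [0.0, 0.0, 1.0, 0.0]])
P = K @ Rt

grid = ground_grid(-2, 2, 4, 10, 1.0)

# Synthetic foreground mask: one 0.6 m x 1.8 m object standing at X = 1, Z = 6.
mask = np.zeros((480, 640), dtype=bool)
u0, v0, u1, v1 = cell_box(P, np.array([1.0, 0.0, 6.0]), 0.6, 1.8, mask.shape)
mask[v0:v1, u0:u1] = True

occ = occupancy_map(P, grid, mask, 0.6, 1.8)
best = grid[np.argmax(occ)]   # ground-plane position with the highest occupancy
```

Because the boxes for all grid cells depend only on the camera and the grid, they could be precomputed once per scene. A region-wise detector could feed the same structure by replacing the binary mask with a per-pixel score map, though how the paper couples such detectors to the grid is not specified in the abstract.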
