Cascaded attention and grouping for object recognition from video

Object detection is an enabling technology that plays a key role in many application areas, such as content-based media retrieval. Cognitive vision systems are proposed here in which the focus of attention is directed towards the most informative processing. The attentive detection system uses a cascade of increasingly complex radial basis function (RBF) network classifiers for the stepwise identification of regions of interest (ROIs) and refined object hypotheses. A coarse classifier determines first approximations of the ROIs, while more complex classifiers provide sufficiently accurate and consistent pose estimates. Objects are modelled by local appearances and in terms of posterior distributions in eigenspace. Experiments on the automatic detection of brand objects in Formula One broadcasts clearly illustrate the benefit of applying decision making on attention and probabilistic grouping.
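The cascade idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian basis functions, the 2-D feature vectors (standing in for eigenspace projections of local patches), and all centers, weights, widths, and thresholds are hypothetical values chosen only to show how a cheap coarse stage reduces the number of candidates that the more complex stage has to evaluate.

```python
import math


def rbf_score(x, centers, weights, sigma):
    """Weighted sum of Gaussian radial basis functions evaluated at x."""
    total = 0.0
    for c, w in zip(centers, weights):
        sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        total += w * math.exp(-sq_dist / (2.0 * sigma ** 2))
    return total


def cascade_filter(candidates, stages):
    """Pass candidates through the stages in order.

    Each stage is a tuple (centers, weights, sigma, threshold). A candidate
    survives a stage if its RBF score reaches the stage threshold. Returns
    the survivors and the number of candidates evaluated at each stage.
    """
    survivors = list(candidates)
    evaluations = []
    for centers, weights, sigma, threshold in stages:
        evaluations.append(len(survivors))
        survivors = [
            x for x in survivors
            if rbf_score(x, centers, weights, sigma) >= threshold
        ]
    return survivors, evaluations


# Hypothetical stages: a coarse stage with one wide basis function,
# then a finer stage with two narrow basis functions.
coarse = ([(0.0, 0.0)], [1.0], 2.0, 0.5)
fine = ([(0.0, 0.0), (0.5, 0.5)], [0.5, 0.5], 0.5, 0.6)

# Hypothetical candidate feature vectors.
candidates = [(0.1, 0.1), (1.5, 1.5), (5.0, 5.0), (0.4, 0.4)]

rois, evaluations = cascade_filter(candidates, [coarse, fine])
print(rois)         # candidates accepted by every stage
print(evaluations)  # how many candidates each stage had to score
```

In this toy run the coarse stage rejects the far-away candidate, so the finer stage scores three candidates instead of four. On real video frames the saving would depend on how selective the early stages are.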
