A bibliography of object class recognition and object recognition based on visual attention

Object class recognition has made significant progress in recent years and is now an integral component of many machine vision applications. However, object class recognition using visual attention image segmentation is a novel idea that has been developed only in the past decade. This paper presents a comprehensive survey of object class recognition and object recognition algorithms, together with their applications based on recently published visual attention region selection methods. Increasing effort has also been directed toward developing generic methods for categorizing all objects in a domain, such as Winn’s method, which is used to recognize object classes at a glance. The majority of object class recognition algorithms are highly dependent on shape matching results. The purpose of this review is to compare visual attention (bottom-up and top-down), object recognition (e.g., SIFT, SURF and PCA-SIFT) and object class recognition methods, to help researchers identify the most appropriate method for a particular purpose. This survey is suitable for researchers in the pattern recognition field, providing familiarity with the existing algorithms for object classification from image acquisition to final output (i.e., image segmentation, object recognition and object classification). At the end of each part, the challenges and a critical analysis table are provided, and future directions for every method are suggested at the end of the paper to support the development of new ideas. Additionally, this approach allows researchers to find the definitions of keywords and to gain a brief understanding of how each method works and what results it obtains on various datasets.
