Salient views and view-dependent dictionaries for object recognition

A sparse representation-based approach is proposed to determine the salient views of 3D objects. The salient views are categorized into two groups. The first group consists of boundary representative views, which show several visible sides and object surfaces that may be attractive to humans. The second group consists of side representative views, which best represent the views from the sides of an approximating convex shape. The side representative views are class-specific and have greater representative power than other within-class views. Using the concept of the characteristic view class, we first present a sparse representation-based approach for estimating the boundary representative views. With the estimated boundaries, we determine the side representative views based on a minimum reconstruction error criterion. Furthermore, to evaluate our method, we introduce the notion of view-dependent dictionaries built from salient views for applications in 3D object recognition and retrieval. The proposed view-dependent dictionaries encode information on the geometry across views and on the representation of the object. Through a series of experiments on four publicly available 3D object datasets, we demonstrate the effectiveness of our approach compared to two existing state-of-the-art algorithms and one baseline method.

Highlights:
A sparse representation-based approach is proposed to determine the salient views of 3D objects.
The salient views are categorized into boundary representative and side representative views.
We present a sparse representation-based approach for estimating the boundary representative views.
We determine the side representative views based on a minimum reconstruction error criterion.
We introduce view-dependent dictionaries for applications in 3D object recognition and retrieval.
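The minimum reconstruction error criterion mentioned above can be illustrated with a small sketch. This is not the paper's formulation, which the abstract does not spell out. It assumes that each view in a segment between two boundary views is described by a feature vector, and that a candidate view is scored by how well it alone (as a single unit-norm atom) reconstructs every view in the segment in the least-squares sense. The function name `side_representative_view` and the toy feature vectors are hypothetical.

```python
import numpy as np


def side_representative_view(features):
    """Pick the view that reconstructs the segment with the least error.

    features: (n_views, dim) array, one feature vector per view.
    Returns (index of the selected view, total error per candidate).
    Assumption: each candidate is used as a single unit-norm atom.
    """
    X = np.asarray(features, dtype=float)
    # Normalize each view so it can serve as a unit-norm atom.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    atoms = X / np.where(norms == 0, 1.0, norms)
    # coeffs[i, j]: least-squares coefficient of view j on atom i.
    coeffs = atoms @ X.T
    # Squared residual of view j after projecting it onto atom i.
    energy = np.sum(X ** 2, axis=1)
    residual = energy[None, :] - coeffs ** 2
    # Sum the residuals over all views for each candidate atom.
    total_error = residual.sum(axis=1)
    return int(np.argmin(total_error)), total_error


# Toy segment: the middle view lies between the two extreme views.
views = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
best, errors = side_representative_view(views)
print(best, errors)
```

In this toy segment the middle view is selected, because it is the closest single direction to both of its neighbors. A fuller treatment would presumably use multi-atom sparse coding (for example, orthogonal matching pursuit) rather than a single atom.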
