Robust sequential view planning for object recognition using multiple cameras

While prior research in active object recognition and pose estimation has mostly focused on single-camera systems, we propose two multi-camera solutions to this problem that can improve the object recognition rate, particularly in the presence of occlusion. In the proposed methods, multiple cameras simultaneously acquire images from different view angles of an unknown, randomly occluded object belonging to a set of a priori known objects. At each step, the recognition algorithms process the available information within a recursive Bayesian framework and classify the object if its identity and pose can be determined with a high confidence level. Otherwise, the algorithms compute the next most informative camera positions for capturing more images. Principal component analysis (PCA) is used to produce a measurement vector from the acquired images. Occlusions in the images are handled by a novel probabilistic modelling approach that can increase the robustness of the recognition process with respect to structured noise. The camera positions at each recognition step are selected based on two statistical metrics that quantify the quality of the observations, namely the mutual information (MI) and the Cramér-Rao lower bound (CRLB). While the former has been used in prior related work, the latter is new in the context of object recognition. Extensive Monte Carlo experiments conducted with a two-camera system demonstrate the effectiveness of the proposed approaches.
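The recognition loop described above (recursive Bayesian update, stop when confident, otherwise pick the most informative next view) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single camera, a discrete measurement alphabet instead of PCA feature vectors, no occlusion model, and uses MI only (no CRLB). The names `bayes_update`, `mutual_information`, `recognize`, `models`, and `observe` are hypothetical and introduced only for illustration.

```python
import numpy as np

def bayes_update(prior, likelihood):
    """One recursive Bayesian step: posterior is proportional to likelihood * prior."""
    post = prior * likelihood
    return post / post.sum()

def mutual_information(prior, lik):
    """MI (in nats) between object identity X and a discrete measurement Z.

    lik[i, k] is an assumed P(Z = k | X = i) for one candidate view.
    """
    joint = prior[:, None] * lik              # P(X = i, Z = k)
    pz = joint.sum(axis=0)                    # marginal P(Z = k)
    indep = prior[:, None] * pz[None, :]      # P(X = i) * P(Z = k)
    mask = joint > 0                          # skip zero-probability cells
    return float(np.sum(joint[mask] * np.log(joint[mask] / indep[mask])))

def recognize(models, observe, threshold=0.95, max_steps=10):
    """Greedy MI-driven view selection (toy, single camera).

    models[v][i, k] = assumed P(Z = k | object i, view v).
    observe(v) returns the measurement index obtained at view v.
    """
    n_obj = models[0].shape[0]
    belief = np.full(n_obj, 1.0 / n_obj)      # uniform prior over known objects
    for _ in range(max_steps):
        # Choose the view whose measurement is expected to be most informative.
        v = max(range(len(models)),
                key=lambda u: mutual_information(belief, models[u]))
        z = observe(v)
        belief = bayes_update(belief, models[v][:, z])
        if belief.max() >= threshold:         # confident enough to classify
            break
    return int(belief.argmax()), belief

# Toy setup: view 0 cannot separate the two objects, view 1 can.
models = [np.array([[0.5, 0.5], [0.5, 0.5]]),
          np.array([[0.9, 0.1], [0.1, 0.9]])]
label, belief = recognize(models, observe=lambda v: 0)
```

In this toy setup the uninformative view has zero MI, so the loop keeps selecting view 1 until the belief crosses the threshold. A multi-camera variant would score joint camera placements rather than single views.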
