Applying Ensembles of Multilinear Classifiers in the Frequency Domain

Ensemble methods such as bootstrapping, bagging, and boosting have had a considerable impact on recent developments in machine learning, pattern recognition, and computer vision. Theoretical and practical results alike have established that, in terms of accuracy, ensembles of weak classifiers generally outperform monolithic solutions. However, this accuracy comes at the cost of an extensive training process. The work presented in this paper arises from projects on advanced human-machine interaction. In scenarios like ours, online learning is a major requirement, and lengthy training is prohibitive. We therefore propose a different approach to ensemble learning. Instead of a set of weak classifiers, we combine strong, separable, multilinear discriminant functions. These are especially well suited to computer vision: they train very quickly and allow for rapid classification of image content. Training different classifiers for different contexts, or on semantically organized data, yields ensembles of experts. We collapse a set of experts into a single multilinear function and thus achieve the same runtime for arbitrarily many classifiers as for a single one. Moreover, carrying out the classification in the frequency domain results in higher frame rates. Experiments with image sequences recorded in typical home environments show that our ensemble training schemes yield high accuracy on unconstrained and cluttered data.
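The two runtime claims above can be illustrated with a small NumPy sketch. It is not the paper's implementation. It assumes that each expert is a linear filter built from a sum of outer products (a stand-in for a trained separable classifier) and that the ensemble combines experts by simply adding their responses; the paper's actual combination scheme may differ. Under these assumptions, linearity lets the experts be summed into one filter before classification, and the correlation theorem lets the sliding-window response be computed with FFTs. All names (`separable_filter`, `correlate_spatial`, `correlate_fft`) and all sizes are hypothetical.

```python
import numpy as np


def separable_filter(rng, shape, rank):
    """Random sum of `rank` outer products u_r v_r^T (stand-in for a trained expert)."""
    h, w = shape
    U = rng.standard_normal((rank, h))
    V = rng.standard_normal((rank, w))
    return np.einsum("ri,rj->ij", U, V)


def correlate_spatial(image, filt):
    """Sliding-window response at all fully overlapping positions, computed directly."""
    H, W = image.shape
    h, w = filt.shape
    out = np.empty((H - h + 1, W - w + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(image[y:y + h, x:x + w] * filt)
    return out


def correlate_fft(image, filt):
    """Same response via the correlation theorem: IFFT(FFT(image) * conj(FFT(filter)))."""
    H, W = image.shape
    h, w = filt.shape
    F_img = np.fft.rfft2(image)
    F_flt = np.fft.rfft2(filt, s=image.shape)  # zero-pads the filter to image size
    full = np.fft.irfft2(F_img * np.conj(F_flt), s=image.shape)
    # Positions beyond this crop wrap around circularly and are discarded.
    return full[:H - h + 1, :W - w + 1]


rng = np.random.default_rng(0)
image = rng.standard_normal((32, 40))
experts = [separable_filter(rng, (8, 6), rank=2) for _ in range(5)]

# Baseline: evaluate every expert separately in the spatial domain and add responses.
ensemble_response = sum(correlate_spatial(image, f) for f in experts)

# Collapsed: one summed filter, one frequency-domain pass, independent of ensemble size.
collapsed = np.sum(experts, axis=0)
fast_response = correlate_fft(image, collapsed)

print(np.allclose(ensemble_response, fast_response))
```

In this sketch the collapsed filter costs one pass regardless of how many experts were summed, which mirrors the constant-runtime claim; the summed filter is still a sum of outer products, only with more terms than any single expert.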
