Linear Dependency Modeling for Classifier Fusion and Feature Combination

This paper addresses the independence assumption in the fusion process. Over the last decade, dependency modeling techniques have been developed either under a specific distribution of the classifiers or by estimating the joint distribution of the posteriors. This paper proposes a new framework that models the dependency between features without any assumption on the feature or classifier distribution, and that avoids the difficulty of estimating a high-dimensional joint density. We prove that, under some mild assumptions, feature dependency can be modeled by a linear combination of the posterior probabilities. Based on this linear combination property, two methods, Linear Classifier Dependency Modeling (LCDM) and Linear Feature Dependency Modeling (LFDM), are derived for dependency modeling at the classifier level and the feature level, respectively. The optimal models for LCDM and LFDM are learned by maximizing the margin between the genuine and imposter posterior probabilities. Experiments are conducted on both synthetic data and real datasets. The results show that LCDM and LFDM outperform existing classifier-level and feature-level combination methods, respectively, both under non-normal distributions and on four real databases. Among the classifier-level and feature-level fusion methods compared, LFDM gives the best performance.
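The idea of fusing posteriors linearly and choosing the weights by a margin criterion can be illustrated with a small sketch. This is not the paper's LCDM or LFDM formulation: the function names (`fuse`, `learn_weights`), the hinge-style loss with a fixed `target_margin`, the plain subgradient update, and the toy data are all assumptions made for illustration only.

```python
import numpy as np


def softmax(z):
    """Row-wise softmax, used here only to build toy posteriors."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def fuse(posteriors, weights):
    """Linear combination over the classifier axis: (..., M, C) -> (..., C)."""
    return np.einsum("m,...mc->...c", weights, posteriors)


def learn_weights(P, y, target_margin=1.0, lr=0.1, epochs=200):
    """Hypothetical margin-based weight learning (not the paper's optimizer).

    P has shape (N, M, C): N samples, M classifiers, C classes.
    y has shape (N,) and holds the genuine class index of each sample.
    """
    n, m, _ = P.shape
    w = np.full(m, 1.0 / m)          # start from the plain average rule
    idx = np.arange(n)
    for _ in range(epochs):
        fused = fuse(P, w)           # (N, C) fused scores
        masked = fused.copy()
        masked[idx, y] = -np.inf     # hide the genuine class
        rival = masked.argmax(axis=1)  # strongest imposter class
        margin = fused[idx, y] - fused[idx, rival]
        violated = margin < target_margin
        if not violated.any():
            break
        # Subgradient of the mean hinge loss max(0, target_margin - margin).
        diff = P[idx, :, y] - P[idx, :, rival]   # (N, M)
        grad = -diff[violated].sum(axis=0) / n
        w -= lr * grad
    return w


# Toy data (assumed): classifier 0 is informative, classifier 1 is noise.
rng = np.random.default_rng(0)
N, C = 200, 3
y = rng.integers(0, C, size=N)
good = softmax(3.0 * np.eye(C)[y] + 0.5 * rng.normal(size=(N, C)))
noise = softmax(rng.normal(size=(N, C)))
P = np.stack([good, noise], axis=1)              # (N, 2, C)

w = learn_weights(P, y)
accuracy = (fuse(P, w).argmax(axis=1) == y).mean()
print("weights:", w, "accuracy:", accuracy)
```

In this toy setting the learned weight of the informative classifier should end up larger than that of the noise classifier, which is the behavior a margin criterion is meant to encourage. The weights are left unconstrained here; any constraints used in the paper are not reproduced.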
