Domain adaptation methods for robust pattern recognition

Most classical and modern estimation techniques assume that the data seen at test time are drawn from the same process that generated the training data. In many real-world applications this assumption is restrictive. We outline two families of solutions to this heterogeneity: instance weighting and dimension reduction. Instance-weighting methods estimate weights for the training loss so that the weighted training distribution "looks like" the testing distribution, whereas dimension-reduction methods seek transformations of the training and testing data that map both into a latent space in which their distributions are similar. We compare the methods on synthetic datasets and a real-data example.
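
As a concrete illustration of the instance-weighting idea, the sketch below estimates importance weights with a discriminative density-ratio trick: a probabilistic classifier is trained to distinguish training inputs from testing inputs, and the ratio of its predicted probabilities approximates p_test(x)/p_train(x). This is a minimal stand-in for estimators such as kernel mean matching or direct importance estimation, assuming scikit-learn is available; the data and variable names are illustrative, not from the paper.

```python
# Minimal sketch of instance weighting under covariate shift.
# Hypothetical data; the logistic-regression density-ratio estimator
# stands in for methods such as kernel mean matching.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic source (training) and target (testing) inputs with shifted means.
X_train = rng.normal(loc=0.0, scale=1.0, size=(500, 2))
X_test = rng.normal(loc=1.0, scale=1.0, size=(500, 2))
y_train = (X_train[:, 0] + X_train[:, 1] > 0).astype(int)

# Step 1: estimate the density ratio p_test(x) / p_train(x) by training a
# probabilistic classifier to separate test points (label 1) from training
# points (label 0); the odds ratio of its predictions estimates the
# density ratio up to the class prior.
domain_X = np.vstack([X_train, X_test])
domain_y = np.concatenate([np.zeros(len(X_train)), np.ones(len(X_test))])
domain_clf = LogisticRegression().fit(domain_X, domain_y)
p_test = domain_clf.predict_proba(X_train)[:, 1]
weights = p_test / (1.0 - p_test)  # importance weight per training point

# Step 2: plug the weights into the training loss so the weighted
# training distribution "looks like" the testing distribution.
task_clf = LogisticRegression().fit(X_train, y_train, sample_weight=weights)
```

Any estimator that accepts per-sample weights in its loss can be substituted for the final classifier; the weighting step is independent of the downstream model.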

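For the dimension-reduction family, the sketch below implements subspace alignment: PCA bases are fit separately to the training and testing inputs, the source basis is rotated onto the target basis in closed form, and both domains are projected into the resulting shared latent space. This is a simple stand-in for manifold-based approaches such as the geodesic flow kernel; the data and dimensions are illustrative assumptions, not from the paper.

```python
# Minimal sketch of a dimension-reduction approach: subspace alignment,
# used here as a stand-in for manifold-based methods such as the
# geodesic flow kernel. Data and names are hypothetical.
import numpy as np

def pca_basis(X, d):
    """Return the top-d principal directions (columns) of centered X."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:d].T  # shape (n_features, d)

d = 2
rng = np.random.default_rng(1)
X_train = rng.normal(size=(500, 10))
X_test = rng.normal(loc=0.5, size=(500, 10))

B_s = pca_basis(X_train, d)  # source subspace basis
B_t = pca_basis(X_test, d)   # target subspace basis

# Align the source basis with the target basis: M = B_s^T B_t is the
# closed-form minimizer of ||B_s M - B_t||_F when B_s has orthonormal columns.
M = B_s.T @ B_t

# Project both domains into the shared latent space, where a standard
# classifier can be trained on the source and applied to the target.
Z_train = (X_train - X_train.mean(axis=0)) @ B_s @ M
Z_test = (X_test - X_test.mean(axis=0)) @ B_t
```

After projection, Z_train and Z_test live in the same d-dimensional space, so any off-the-shelf classifier trained on Z_train can be evaluated directly on Z_test.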