Unsupervised domain adaptation with copula models

We study the task of unsupervised domain adaptation, where no labeled data from the target domain is available at training time. To handle potential discrepancies between the source and target distributions, in both features and labels, we exploit a copula-based regression framework. The benefits of this approach are two-fold: (a) it allows us to model a broader range of conditional predictive densities than the common exponential family; (b) it lets us leverage Sklar's theorem, the core of the copula formulation relating a joint density to its marginals and a copula dependency function, to find effective feature mappings that mitigate the domain mismatch. By transforming the data to a copula domain, and using different regression models for prediction, we achieve more robust and accurate estimation of target labels on a number of benchmark datasets (including human emotion estimation) than recently proposed feature transformation (adaptation) methods.
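A minimal sketch of the copula-domain transform the abstract alludes to, under the assumption that the mapping is the standard probability integral transform via empirical CDFs (the function name and toy data below are hypothetical, not the paper's implementation):

```python
import numpy as np

def to_copula_domain(X):
    """Probability integral transform: map each column to near-Uniform(0,1)
    pseudo-observations via its empirical CDF, i.e. rank / (n + 1)."""
    ranks = np.argsort(np.argsort(X, axis=0), axis=0) + 1.0
    return ranks / (X.shape[0] + 1.0)

rng = np.random.default_rng(0)
# Toy source/target pair sharing the same dependence structure but with
# shifted and rescaled marginals, mimicking a feature-level domain mismatch.
z = rng.standard_normal((500, 2))
source = z
target = 3.0 * z + 5.0

u_src = to_copula_domain(source)
u_tgt = to_copula_domain(target)
# Monotone marginal changes leave ranks untouched, so in the copula domain
# the two domains coincide: the marginal mismatch has been removed while
# the dependence structure (the copula) is preserved.
print(np.allclose(u_src, u_tgt))  # True
```

Since the empirical CDF depends only on ranks, any strictly increasing per-feature distortion between domains vanishes after the transform, which is the intuition behind using Sklar's decomposition to build domain-invariant features.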
