Meta-Learning for Relative Density-Ratio Estimation

The ratio of two probability densities, called a density-ratio, is a vital quantity in machine learning. In particular, the relative density-ratio, a bounded extension of the density-ratio, has received much attention due to its stability and has been used in applications such as outlier detection and dataset comparison. Existing methods for (relative) density-ratio estimation (DRE) require many instances from both densities. In practice, however, sufficient instances are often unavailable. In this paper, we propose a meta-learning method for relative DRE, which estimates the relative density-ratio from a few instances by using knowledge from related datasets. Specifically, given two datasets that each consist of a few instances, our model extracts the datasets' information with neural networks and uses it to obtain instance embeddings appropriate for relative DRE. We model the relative density-ratio as a linear model on the embedded space, whose globally optimal solution can be obtained in closed form. The closed-form solution enables fast and effective adaptation to a few instances, and its differentiability lets us train the model so that the expected test error for relative DRE is explicitly minimized after adapting to a few instances. We empirically demonstrate the effectiveness of the proposed method on three problems: relative DRE, dataset comparison, and outlier detection.
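The closed-form linear model the abstract refers to follows the least-squares approach to relative DRE (RuLSIF, Kanamori et al., 2011), on which this work builds. As a minimal sketch of that building block (not the proposed meta-learned embeddings), the α-relative ratio r_α(x) = p(x) / (α p(x) + (1−α) q(x)) can be modeled as a linear combination of Gaussian kernels, whose coefficients solve a ridge-regularized linear system; all function names and hyperparameter values below are illustrative assumptions:

```python
import numpy as np

def rulsif_fit(x_num, x_den, alpha=0.5, sigma=1.0, lam=0.1):
    """Closed-form relative density-ratio estimation (RuLSIF-style sketch).

    Models r_alpha(x) = p(x) / (alpha*p(x) + (1-alpha)*q(x)) as a linear
    combination of Gaussian kernels centered at the numerator samples.
    """
    centers = x_num  # kernel centers; using numerator samples is a common choice

    def phi(x):
        # Gaussian kernel features: phi_j(x) = exp(-||x - c_j||^2 / (2 sigma^2))
        d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma ** 2))

    Phi_p, Phi_q = phi(x_num), phi(x_den)
    # H-hat: empirical second moment of the features under the alpha-mixture
    H = (alpha * Phi_p.T @ Phi_p / len(x_num)
         + (1 - alpha) * Phi_q.T @ Phi_q / len(x_den))
    # h-hat: empirical first moment of the features under p
    h = Phi_p.mean(axis=0)
    # Global optimum of the regularized squared loss, in closed form
    theta = np.linalg.solve(H + lam * np.eye(len(centers)), h)
    return lambda x: phi(x) @ theta

rng = np.random.default_rng(0)
x_p = rng.normal(0.0, 1.0, size=(200, 1))   # samples from p
x_q = rng.normal(0.5, 1.0, size=(200, 1))   # samples from q
r = rulsif_fit(x_p, x_q, alpha=0.5)
print(float(r(np.zeros((1, 1)))))           # estimated r_alpha at x = 0
```

Because the true α-relative ratio is bounded above by 1/α (here, 2), the estimate stays stable even where q has little mass; the differentiability of the `np.linalg.solve` step with respect to the features is what allows the paper's model to backpropagate the expected test error through the adaptation.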
