Constrained Least-Squares Density-Difference Estimation

We address the problem of estimating the difference between two probability den- sities. A naive approach is a two-step procedure thatrst estimates two densities separately and then computes their difference. However, such a two-step procedure does not necessarily work well because therst step is performed without regard to the second step and thus a small error in therst stage can cause a big error in the second stage. Recently, a single-shot method called the least-squares density- difference (LSDD) estimator has been proposed. LSDD directly estimates the den- sity difference without separately estimating two densities, and it was demonstrated to outperform the two-step approach. In this paper, we propose a variation of LSDD called the constrained least-squares density-difference (CLSDD) estimator, and the- oretically prove that CLSDD improves the accuracy of density difference estimation for correctly specied parametric models. The usefulness of the proposed method is also demonstrated experimentally.

[1]  Sugiyama Masashi,et al.  Salient Object Detection Based on Direct Density-Ratio Estimation , 2012 .

[2]  Masashi Sugiyama,et al.  Sequential change‐point detection based on direct density‐ratio estimation , 2012, Stat. Anal. Data Min..

[3]  Masashi Sugiyama,et al.  Change-point detection in time-series data by relative density-ratio estimation , 2012 .

[4]  Masashi Sugiyama,et al.  Semi-Supervised Learning of Class Balance under Class-Prior Change by Distribution Matching , 2012, ICML.

[5]  Marco Saerens,et al.  Adjusting the Outputs of a Classifier to New a Priori Probabilities: A Simple Procedure , 2002, Neural Computation.

[6]  Masashi Sugiyama,et al.  Sufficient Dimension Reduction via Squared-Loss Mutual Information Estimation , 2010, Neural Computation.

[7]  Bernhard Schölkopf,et al.  Introduction to Semi-Supervised Learning , 2006, Semi-Supervised Learning.

[8]  Robert Tibshirani,et al.  The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd Edition , 2001, Springer Series in Statistics.

[9]  Nigel Collier,et al.  Change-Point Detection in Time-Series Data by Relative Density-Ratio Estimation , 2012, Neural Networks.

[10]  David M. Gray,et al.  Quadratic mutual information for dimensionality reduction and classification , 2010, Defense + Commercial Sensing.

[11]  KawaharaYoshinobu,et al.  Sequential change-point detection based on direct density-ratio estimation , 2012 .

[12]  Takafumi Kanamori,et al.  Density-Difference Estimation , 2012, Neural Computation.

[13]  Yi Lin,et al.  Support Vector Machines for Classification in Nonstandard Situations , 2002, Machine Learning.

[14]  Kari Torkkola,et al.  Feature Extraction by Non-Parametric Mutual Information Maximization , 2003, J. Mach. Learn. Res..

[15]  Radford M. Neal Pattern Recognition and Machine Learning , 2007, Technometrics.

[16]  Jing Peng,et al.  SVM vs regularized least squares classification , 2004, ICPR 2004.

[17]  Xianglong Tang,et al.  Probability density difference-based active contour for ultrasound image segmentation , 2010, Pattern Recognit..

[18]  Stephen P. Boyd,et al.  Convex Optimization , 2004, Algorithms and Theory of Computation Handbook.

[19]  Arthur Albert,et al.  Regression and the Moore-Penrose Pseudoinverse , 2012 .

[20]  Masashi Sugiyama,et al.  Detection of activities and events without explicit categorization , 2011, 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops).

[21]  Calyampudi R. Rao,et al.  Linear statistical inference and its applications , 1965 .

[22]  T. Poggio,et al.  Regularized Least-Squares Classification 133 In practice , although , 2007 .

[23]  Charles Elkan,et al.  The Foundations of Cost-Sensitive Learning , 2001, IJCAI.

[24]  Robert Tibshirani,et al.  An Introduction to the Bootstrap , 1994 .