Paper information - P-values for classification


Let $(X,Y)$ be a random variable consisting of an observed feature vector $X\in \mathcal{X}$ and an unobserved class label $Y\in \{1,2,\ldots,L\}$ with unknown joint distribution. In addition, let $\mathcal{D}$ be a training data set consisting of $n$ completely observed independent copies of $(X,Y)$. Usual classification procedures provide point predictors (classifiers) $\widehat{Y}(X,\mathcal{D})$ of $Y$ or estimate the conditional distribution of $Y$ given $X$. To quantify the certainty of classifying $X$, we propose to construct for each $\theta =1,2,\ldots,L$ a p-value $\pi_{\theta}(X,\mathcal{D})$ for the null hypothesis that $Y=\theta$, treating $Y$ temporarily as a fixed parameter. In other words, the point predictor $\widehat{Y}(X,\mathcal{D})$ is replaced with a prediction region for $Y$ with a certain confidence. We argue that (i) this approach has advantages over traditional approaches and (ii) any reasonable classifier can be modified to yield nonparametric p-values. We discuss issues such as optimality, single-use and multiple-use validity, as well as computational and graphical aspects.
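To make the idea of class-wise p-values and the resulting prediction region concrete, the following is a minimal sketch and not the construction used in the paper. It assumes a rank-based (permutation-type) p-value: the new observation is tentatively added to class $\theta$, and its atypicality score is ranked among the scores of that class. The score (Euclidean distance to the class centre) and the names `class_p_values` and `prediction_region` are hypothetical choices made purely for illustration.

```python
import numpy as np


def class_p_values(x, X_train, y_train, labels):
    """Return {theta: p-value} for the null hypotheses Y = theta.

    Assumption: a simple rank-based p-value with a hypothetical score,
    chosen only to illustrate the idea.
    """
    p_values = {}
    for theta in labels:
        # Tentatively add x to class theta so that, under the null
        # hypothesis, x is exchangeable with that class's training points.
        group = np.vstack([X_train[y_train == theta], x])
        centre = group.mean(axis=0)
        # Hypothetical atypicality score: distance to the class centre.
        scores = np.linalg.norm(group - centre, axis=1)
        # Fraction of the augmented class (x included) that is at least
        # as atypical as x; this is never smaller than 1 / (n_theta + 1).
        p_values[theta] = float(np.mean(scores >= scores[-1]))
    return p_values


def prediction_region(p_values, alpha):
    """Keep every label whose null hypothesis is not rejected at level alpha."""
    return [theta for theta, p in p_values.items() if p > alpha]
```

With a small `alpha`, the region may contain several labels when the classes overlap near `x`, or no label at all when `x` is atypical for every class. A point predictor cannot convey either outcome.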
