Nonparametric Composite Hypothesis Testing in an Asymptotic Regime

We investigate the nonparametric, composite hypothesis testing problem for arbitrary unknown distributions in the asymptotic regime where both the sample size and the number of hypotheses grow large, with the number of hypotheses growing exponentially in the sample size. Such asymptotic analysis is important in many practical problems, where the number of variations that can exist within a family of distributions can be countably infinite. We introduce the notion of discrimination capacity, which captures the largest exponential growth rate of the number of hypotheses relative to the sample size for which there exists a test with asymptotically vanishing probability of error. Our approach is based on distributional distance metrics, which incorporate the generative model of the data. We analyze the error exponents of tests based on the maximum mean discrepancy and the Kolmogorov–Smirnov distance, and characterize the corresponding discrimination rates, i.e., lower bounds on the discrimination capacity, for these tests. Finally, we develop an upper bound on the discrimination capacity based on Fano's inequality. Numerical results are presented to validate the theoretical results.
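To make the distance-based tests concrete, the following is a minimal sketch of a minimum-distance classifier using the two distances named above. It assumes scalar data, one training sample per hypothesis, and a Gaussian kernel with a fixed bandwidth; the function names (`ks_distance`, `mmd2_biased`, `classify`) and all parameter values are illustrative assumptions, not taken from the paper, and the paper's exact test statistics may differ.

```python
import numpy as np


def ks_distance(x, y):
    """Kolmogorov-Smirnov distance: sup-norm gap between two empirical CDFs."""
    grid = np.sort(np.concatenate([x, y]))
    # Empirical CDF of each sample, evaluated at every observed point.
    fx = np.searchsorted(np.sort(x), grid, side="right") / len(x)
    fy = np.searchsorted(np.sort(y), grid, side="right") / len(y)
    return float(np.max(np.abs(fx - fy)))


def gaussian_gram(a, b, sigma):
    """Gaussian-kernel Gram matrix between two 1-D samples."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-d2 / (2.0 * sigma**2))


def mmd2_biased(x, y, sigma=1.0):
    """Biased (V-statistic) estimate of the squared maximum mean discrepancy.

    The bandwidth sigma is an assumed, fixed value chosen for illustration.
    """
    return float(
        gaussian_gram(x, x, sigma).mean()
        + gaussian_gram(y, y, sigma).mean()
        - 2.0 * gaussian_gram(x, y, sigma).mean()
    )


def classify(test_sample, training_samples, distance):
    """Return the index of the hypothesis whose training sample is closest."""
    return int(np.argmin([distance(test_sample, t) for t in training_samples]))


# Illustrative setup: three hypotheses, each an unknown distribution
# represented only by a training sample (unit-variance Gaussians here).
rng = np.random.default_rng(0)
n = 500
training_samples = [rng.normal(loc=m, scale=1.0, size=n) for m in (0.0, 2.0, 4.0)]
test_sample = rng.normal(loc=2.0, scale=1.0, size=n)  # drawn from hypothesis 1

print("KS decision: ", classify(test_sample, training_samples, ks_distance))
print("MMD decision:", classify(test_sample, training_samples, mmd2_biased))
```

In this sketch the number of hypotheses is fixed at three; the regime studied in the abstract instead lets the number of hypotheses grow exponentially with the sample size `n`, which is where the discrimination rate of each distance becomes the relevant quantity.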
