A Data-Driven Approach to Robust Hypothesis Testing Using Kernel MMD Uncertainty Sets

The problem of robust hypothesis testing is studied, where under the null and alternative hypotheses, the data-generating distributions are assumed to belong to some uncertainty sets. In this paper, the uncertainty sets are constructed in a data-driven manner: they are centered around the empirical distributions of training samples from the null and alternative hypotheses, respectively, and are constrained via the distance between kernel mean embeddings of distributions in the reproducing kernel Hilbert space, i.e., the maximum mean discrepancy (MMD). The Neyman-Pearson setting is investigated, where the goal is to minimize the worst-case probability of missed detection subject to a constraint on the worst-case probability of false alarm. An efficient robust kernel test is proposed and shown to be asymptotically optimal. Numerical results are provided to demonstrate the performance of the proposed robust test.
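
As a rough illustration of the uncertainty-set construction described above, the following minimal sketch estimates the distance between the kernel mean embeddings of two samples and checks whether a candidate sample falls within a given radius of the training sample. It is not the proposed robust test. The Gaussian kernel, the `bandwidth` and `radius` parameters, and the function names `gaussian_kernel`, `mmd`, and `in_uncertainty_set` are all assumptions made here for illustration.

```python
import numpy as np


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gram matrix of a Gaussian kernel (an assumed kernel choice)."""
    # Pairwise squared Euclidean distances between rows of x and rows of y.
    sq = (
        np.sum(x**2, axis=1)[:, None]
        + np.sum(y**2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))


def mmd(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of the MMD between two samples.

    This equals the RKHS distance between the kernel mean embeddings
    of the two empirical distributions.
    """
    kxx = gaussian_kernel(x, x, bandwidth).mean()
    kyy = gaussian_kernel(y, y, bandwidth).mean()
    kxy = gaussian_kernel(x, y, bandwidth).mean()
    return float(np.sqrt(max(kxx + kyy - 2.0 * kxy, 0.0)))


def in_uncertainty_set(candidate, training, radius, bandwidth=1.0):
    """Check whether the empirical distribution of `candidate` lies within
    an MMD ball of the given (hypothetical) radius around `training`."""
    return mmd(candidate, training, bandwidth) <= radius


# Illustrative usage with synthetic data (not from the paper).
rng = np.random.default_rng(0)
train_null = rng.normal(0.0, 1.0, size=(50, 2))
shifted = train_null + 3.0
print(in_uncertainty_set(train_null, train_null, radius=0.1))
print(in_uncertainty_set(shifted, train_null, radius=0.1))
```

The biased estimator is used here because it is always non-negative under the square root; an unbiased estimator would be an equally reasonable choice for a sketch of this kind.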