Application of Classical Nonparametric Predictors to Learning Conditionally I.I.D. Data

In this work we consider the task of pattern recognition under the assumption that examples are conditionally independent. Pattern recognition is the task of predicting a sequence of labels, where each label is predicted from its associated object and from the examples (pairs of objects and labels) observed so far. Traditionally, this task is considered under the assumption that examples are independent and identically distributed (i.i.d.). We show that some classical nonparametric predictors, originally developed to work under the (strict) i.i.d. assumption, retain their consistency under the weaker assumption of conditional independence. By conditional independence we mean that objects are independent and identically distributed given their labels, while the only condition on the distribution of labels is that the rate of occurrence of each label does not tend to zero. The predictors we consider are partitioning and nearest neighbour estimates.
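
The setting described above can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the construction used in this work: the labels follow a deterministic run pattern (so they are clearly not i.i.d., yet each label occurs with a rate bounded away from zero), the objects are drawn from a Gaussian whose mean depends only on the label, and a k-nearest-neighbour rule predicts each label from the examples seen so far. The function names, the Gaussian class-conditional distributions, the run length and the choice k = 3 are all hypothetical choices made for illustration.

```python
import random


def label_sequence(n, run_length=5):
    """Deterministic, non-i.i.d. labels: alternating runs of 0s and 1s.

    Each label occurs with long-run frequency 1/2, so its rate of
    occurrence does not tend to zero. (Illustrative assumption.)
    """
    return [(i // run_length) % 2 for i in range(n)]


def draw_object(label, rng):
    """Objects are i.i.d. given the label: Gaussian with a label-dependent mean."""
    mean = 0.0 if label == 0 else 4.0
    return rng.gauss(mean, 1.0)


def knn_predict(history, x, k=3):
    """Majority vote among the k nearest previously observed objects."""
    if not history:
        return 0  # arbitrary guess before any example has been seen
    nearest = sorted(history, key=lambda pair: abs(pair[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0


def online_error_rate(n=400, k=3, seed=0):
    """Predict each label from its object and all earlier examples."""
    rng = random.Random(seed)
    history, errors = [], 0
    for y in label_sequence(n):
        x = draw_object(y, rng)
        errors += knn_predict(history, x, k) != y
        history.append((x, y))  # the true label is revealed after prediction
    return errors / n


if __name__ == "__main__":
    print("empirical error rate:", online_error_rate())
```

In this sketch the label of each example is fully determined by its position, which violates the i.i.d. assumption on examples, while the distribution of an object given its label never changes. A single simulation run of this kind only illustrates the setting and is not evidence for the consistency claim.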