Learning with a slowly changing distribution

In this paper, we consider the problem of learning a subset of a domain from randomly chosen examples when the probability distribution of the examples changes slowly but continually throughout the learning process. We give upper and lower bounds on the best achievable probability of misclassification after a given number of examples. If d is the VC-dimension of the target function class, t is the number of examples, and Υ is the amount by which the distribution is allowed to change (measured by the largest change in the probability of a subset of the domain), the upper bound decreases as d/t initially, and settles to O(d^{2/3} Υ^{1/2}) for large t. These bounds give necessary and sufficient conditions on Υ, the rate of change of the distribution of examples, to ensure that some learning algorithm can produce an acceptably small probability of misclassification. We also consider the case of learning a near-optimal subset of the domain when the examples and their labels are generated by a joint probability distribution on the example and label spaces. We give an upper bound on Υ that ensures learning is possible from a finite number of examples.
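To make the two regimes of the upper bound concrete, the following is an illustrative sketch (not from the paper) of its shape: an initial d/t transient that dominates for small t, flattening to a plateau of order d^{2/3} Υ^{1/2} once t is large. The constants c1 and c2 are hypothetical placeholders; the paper states only the asymptotic form.

```python
def error_upper_bound(t, d, upsilon, c1=1.0, c2=1.0):
    """Sketch of the abstract's bound: decreases as c1*d/t initially,
    then settles to c2 * d^(2/3) * Upsilon^(1/2) for large t.
    c1, c2 are illustrative constants, not values from the paper."""
    transient = c1 * d / t                          # early-learning regime
    plateau = c2 * d ** (2.0 / 3.0) * upsilon ** 0.5  # drift-limited floor
    return max(transient, plateau)

# With d = 10 and drift rate Upsilon = 1e-6, the bound falls like d/t
# until it hits the plateau, after which more examples no longer help.
d, upsilon = 10, 1e-6
bounds = [error_upper_bound(t, d, upsilon) for t in (10, 100, 10_000, 1_000_000)]
```

Under this sketch the bound is non-increasing in t and eventually constant, which matches the abstract's claim that the achievable misclassification probability cannot be driven below a drift-dependent floor.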