Average-Case Analysis of a Nearest Neighbor Algorithm

In this paper we present an average-case analysis of the nearest neighbor algorithm, a simple induction method that has been studied by many researchers. Our analysis assumes a conjunctive target concept, noise-free Boolean attributes, and a uniform distribution over the instance space. We calculate the probability that the algorithm will encounter a test instance that is distance d from the prototype of the concept, along with the probability that the nearest stored training case is distance e from this test instance. From this we compute the probability of correct classification as a function of the number of observed training cases, the number of relevant attributes, and the number of irrelevant attributes. We also explore the behavioral implications of the analysis by presenting predicted learning curves for artificial domains, and give experimental results on these domains as a check on our reasoning.

Most learning methods form some abstraction from experience and store this structure in memory. The field has explored a wide range of such structures, including decision trees (Quinlan, 1986), multilayer networks (Rumelhart & McClelland, 1986), and probabilistic summaries (Fisher, 1987). However, in recent years there has been growing interest in methods that store instances or cases in memory and apply this specific knowledge directly to new situations. This approach goes by many names, including instance-based learning and case-based reasoning, and one can apply it to many different tasks.

The simplest and most widely studied class of techniques, often called nearest neighbor algorithms, originated in the field of pattern recognition (Cover & Hart, 1967; Dasarathy, 1991) and applies to classification tasks. In the basic method, learning appears almost trivial: one simply stores each training instance in memory. The power of the method comes from the retrieval process. Given a new test instance, one finds the stored training case that is nearest according to some distance measure, notes the class of the retrieved case, and predicts that the new instance will have the same class.

Many variants of this basic algorithm exist. For instance, Stanfill and Waltz (1986) have studied a version that retrieves the k closest instances and bases predictions on a weighted vote, incorporating the distance of each stored instance from the test case; such techniques are often referred to as k-nearest neighbor algorithms. Aha, Kibler, and Albert (1991) have studied an alternative approach that stores cases in memory only upon making an error, thus reducing memory load and retrieval time with …
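To make the storage and retrieval steps concrete, the following is a minimal sketch of the basic method under the assumptions of the analysis (noise-free Boolean attributes, for which Hamming distance is a natural measure). The class and function names are illustrative and not taken from any particular implementation.

from typing import List, Tuple

Instance = Tuple[int, ...]   # noise-free Boolean attributes encoded as 0/1

def hamming_distance(x: Instance, y: Instance) -> int:
    """Number of attributes on which two instances disagree."""
    return sum(a != b for a, b in zip(x, y))

class NearestNeighbor:
    """Basic nearest neighbor learner: store every case, predict from the closest one."""

    def __init__(self) -> None:
        self.cases: List[Tuple[Instance, int]] = []

    def train(self, instance: Instance, label: int) -> None:
        # Learning is simply storage of the observed training case.
        self.cases.append((instance, label))

    def predict(self, instance: Instance) -> int:
        # Retrieval does the real work: find the nearest stored case
        # and predict that the test instance shares its class.
        _, label = min(self.cases,
                       key=lambda case: hamming_distance(case[0], instance))
        return label

# Illustrative use on four Boolean attributes.
nn = NearestNeighbor()
nn.train((1, 1, 0, 1), 1)             # stored positive case
nn.train((0, 0, 1, 0), 0)             # stored negative case
assert nn.predict((1, 1, 1, 1)) == 1  # nearest stored case is the positive one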
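The k-nearest neighbor variant retrieves the k closest cases and combines their classes through a distance-weighted vote. The sketch below extends the NearestNeighbor class above and uses simple inverse-distance weights; this particular weighting is an assumption made for illustration, not necessarily the scheme that Stanfill and Waltz propose.

import heapq
from collections import defaultdict

class WeightedKNearestNeighbor(NearestNeighbor):
    """Predict from a distance-weighted vote over the k closest stored cases."""

    def __init__(self, k: int = 3) -> None:
        super().__init__()
        self.k = k

    def predict(self, instance: Instance) -> int:
        nearest = heapq.nsmallest(
            self.k, self.cases,
            key=lambda case: hamming_distance(case[0], instance))
        votes = defaultdict(float)
        for stored, label in nearest:
            # Closer cases receive larger weights in the vote.
            votes[label] += 1.0 / (1.0 + hamming_distance(stored, instance))
        return max(votes, key=votes.get)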
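The alternative of storing a case only when memory misclassifies it can be sketched with one small change to the training step of the class above. This is a simplified illustration of the idea rather than a reconstruction of any published algorithm.

class ErrorDrivenNearestNeighbor(NearestNeighbor):
    """Store a training case only when the cases already in memory misclassify it."""

    def train(self, instance: Instance, label: int) -> None:
        # The first case is always stored; later cases are stored only on error,
        # which reduces both memory load and retrieval time.
        if not self.cases or self.predict(instance) != label:
            self.cases.append((instance, label))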