Minimax Regression via Adaptive Nearest Neighbor

In this paper, we investigate the convergence rate of k-nearest-neighbor (kNN) regression methods. We first derive the minimax bound for nonparametric regression under general tail and smoothness assumptions. This bound shows that, when the feature distribution has heavy tails, there is a gap between the minimax bound and the rate achieved by standard kNN methods, in which the same k is used for all query points. To close this gap, we propose an adaptive kNN method that selects a smaller k when the query point falls in a region of lower density, and a larger k otherwise. As the density function is unknown, we design a simple method to determine the value of k from the training samples. Using this selection rule, we obtain a desirable tradeoff between bias and variance. Furthermore, we show that the convergence rate of the proposed regression method attains the minimax lower bound, and hence is rate optimal, when the underlying regression function is bounded. We further extend the analysis to the case of unbounded regression functions and show that the proposed method significantly outperforms standard kNN regression in this setting as well.
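To make the idea concrete, the following is a minimal sketch of one way to realize density-dependent k selection, assuming the local density is approximated by counting training points within a fixed radius. The radius r, the scaling constant C, and the linear mapping from local counts to k are hypothetical illustration choices; the paper's actual selection rule is derived from its theoretical analysis and may differ.

```python
# Sketch of an adaptive-k nearest neighbor regressor (illustrative only).
# The specific rule mapping local density to k is an assumption, not the
# paper's exact method.
import numpy as np


def adaptive_knn_predict(X_train, y_train, X_query, r=0.5, C=1.0, k_min=1):
    """kNN regression where k varies with the local density of the training data.

    For each query point, the number of training samples within radius r
    serves as a crude density estimate; k is set proportional to that count,
    so sparse (heavy-tail) regions use a smaller k and dense regions a larger one.
    """
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    X_query = np.atleast_2d(np.asarray(X_query, dtype=float))
    n = len(X_train)

    preds = np.empty(len(X_query))
    for i, x in enumerate(X_query):
        dists = np.linalg.norm(X_train - x, axis=1)
        local_count = np.sum(dists <= r)              # density proxy at the query point
        k = int(np.clip(C * local_count, k_min, n))   # smaller k where density is low
        nearest = np.argsort(dists)[:k]
        preds[i] = y_train[nearest].mean()            # average the k nearest labels
    return preds


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_t(df=2, size=(500, 1))           # heavy-tailed feature distribution
    y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(500)
    print(adaptive_knn_predict(X, y, [[0.0], [5.0]]))
```

In this sketch, a query in the heavy tail (e.g., far from the bulk of the data) sees a small local count and therefore averages over few neighbors, which keeps the bias small where samples are scarce, while queries in dense regions average over many neighbors to reduce variance; this is the bias-variance tradeoff the abstract refers to.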