Impressively fast and efficient KNN construction

K-Nearest-Neighbor (KNN) graphs have emerged as a fundamental building block of many online services such as recommendation, similarity search and classification. Constructing a KNN graph rapidly and accurately is, however, a computationally intensive task. As data volumes keep growing, speed and the ability to scale out are becoming critical factors when deploying a KNN construction algorithm. In this work, we present KIFF, a generic, fast and scalable KNN graph construction algorithm. KIFF directly exploits the bipartite nature of most datasets to which KNN algorithms are applied. This novel strategy drastically limits the computational cost required to rapidly converge to an accurate KNN solution, especially for sparse datasets. Our evaluation on a representative range of datasets shows that KIFF quickly computes a close approximation of the ideal KNN while reducing the computational cost compared to state-of-the-art approaches. KIFF provides, on average, a speed-up factor of 28 while improving the quality of the KNN approximation by 18%.
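To illustrate why a bipartite structure can cut the cost on sparse data, the following is a minimal sketch of the general idea, not the KIFF algorithm itself. It assumes a user-item dataset in which each user has a set of items, and it assumes Jaccard similarity as the metric; the function name `knn_via_shared_items` and the `profiles` layout are hypothetical. Only pairs of users that share at least one item are ever compared, so users with disjoint profiles cost nothing.

```python
from collections import defaultdict


def knn_via_shared_items(profiles, k):
    """Build a KNN graph by comparing only users that share an item.

    profiles: dict mapping each user to a set of items (assumed layout).
    Returns a dict mapping each user to a list of at most k neighbors.
    """
    # Invert the user-item bipartite graph: item -> users holding that item.
    item_users = defaultdict(set)
    for user, items in profiles.items():
        for item in items:
            item_users[item].add(user)

    knn = {}
    for user, items in profiles.items():
        # Count items shared with every user reachable through an item.
        shared = defaultdict(int)
        for item in items:
            for other in item_users[item]:
                if other != user:
                    shared[other] += 1

        # Score candidates only (Jaccard similarity is an assumption here).
        scored = []
        for other, common in shared.items():
            union = len(items) + len(profiles[other]) - common
            scored.append((common / union, other))

        # Keep the k most similar candidates; ties are broken by user id.
        best = sorted(scored, key=lambda s: (-s[0], s[1]))[:k]
        knn[user] = [other for _, other in best]
    return knn
```

On a sparse dataset most user pairs share no item, so the candidate lists stay short compared with the all-pairs comparison a brute-force approach would need.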