Efficiently Learning a Distance Metric for Large Margin Nearest Neighbor Classification

We concern the problem of learning a Mahalanobis distance metric for improving nearest neighbor classification. Our work is built upon the large margin nearest neighbor (LMNN) classification framework. Due to the semidefiniteness constraint in the optimization problem of LMNN, it is not scalable in terms of the dimensionality of the input data. The original LMNN solver partially alleviates this problem by adopting alternating projection methods instead of standard interior-point methods. Still, at each iteration, the computation complexity is at least O(D3) (D is the dimension of input data). In this work, we propose a column generation based algorithm to solve the LMNN optimization problem much more efficiently. Our algorithm is much more scalable in that at each iteration, it does not need full eigen-decomposition. Instead, we only need to find the leading eigenvalue and its corresponding eigenvector, which is of O(D2) complexity. Experiments show the efficiency and efficacy of our algorithms.