Fast distributed k-nn graph update

In this paper, we present an approximate algorithm that can quickly update a large distributed k-nn graph when nodes are added or removed. The resulting graph is highly similar to the one computed with a naïve approach, yet it requires far fewer similarity computations. To achieve this, the algorithm relies on a novel distributed graph-based search procedure. All of these algorithms are evaluated experimentally on both Euclidean and non-Euclidean datasets.
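To make the idea of updating a k-nn graph through graph-based search more concrete, the following is a minimal single-machine sketch and not the distributed algorithm of the paper. It assumes the graph is a dictionary mapping each node id to a list of neighbour ids, and that a greedy walk with a few random restarts stands in for the search procedure. The names `graph_search`, `add_node`, `restarts` and `max_evals` are hypothetical and chosen only for illustration.

```python
import heapq
import random


def graph_search(graph, points, query, similarity, k,
                 restarts=3, max_evals=200, rng=None):
    """Approximate the k nodes most similar to `query` by walking the graph.

    Assumed layout: `graph` maps node id -> list of neighbour ids,
    `points` maps node id -> data item. Similarities are computed only
    for the nodes visited by the walk, not for the whole dataset.
    """
    rng = rng or random.Random(0)
    nodes = list(graph)
    scores = {}  # node id -> similarity to query, each computed once
    for _ in range(restarts):
        current = rng.choice(nodes)
        if current not in scores:
            scores[current] = similarity(points[current], query)
        while len(scores) < max_evals:
            # Score the neighbours of the current node.
            for nb in graph[current]:
                if nb not in scores:
                    scores[nb] = similarity(points[nb], query)
            # Move to the best neighbour, or stop at a local optimum.
            best = max(graph[current], key=scores.get, default=current)
            if scores[best] <= scores[current]:
                break
            current = best
    return heapq.nlargest(k, scores, key=scores.get)


def add_node(graph, points, new_id, new_point, similarity, k):
    """Insert a node using graph search instead of a full scan."""
    neighbours = (graph_search(graph, points, new_point, similarity, k)
                  if graph else [])
    points[new_id] = new_point
    graph[new_id] = neighbours
    # Simplification: only the nodes found by the search get a chance
    # to adopt the new node as one of their own k nearest neighbours.
    for nb in neighbours:
        candidates = set(graph[nb]) | {new_id}
        graph[nb] = heapq.nlargest(
            k, candidates,
            key=lambda c: similarity(points[nb], points[c]))
```

The sketch accepts any similarity function in which larger values mean more similar items, so it is not tied to Euclidean data. How the graph is partitioned across machines and how node removal is handled are left out here.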
