Parallel Construction of Approximate kNN Graph

Building k-nearest neighbor (kNN) graphs is a necessary step in areas such as data mining and machine learning. In this paper, we study kNN graph construction further: we first propose a parallel algorithm for approximate kNN graph construction and then apply the resulting graph to clustering. Experiments show that our mixed-mode MPI/OpenMP code speeds up the construction of the approximate kNN graph and makes parallelization and implementation easier. Finally, we compare the results of agglomerative clustering methods that use our parallel algorithm to illustrate the applicability of the method.
