Affinity Propagation: Clustering Data by Passing Messages

Delbert Dueck
Doctor of Philosophy, Graduate Department of Electrical & Computer Engineering, University of Toronto, 2009

Clustering data by identifying a subset of representative examples is important for detecting patterns in data and in processing sensory signals. Such “exemplars” can be found by randomly choosing an initial subset of data points as exemplars and then iteratively refining it, but this works well only if that initial choice is close to a good solution. This thesis describes a method called “affinity propagation” that simultaneously considers all data points as potential exemplars, exchanging real-valued messages between data points until a high-quality set of exemplars and corresponding clusters gradually emerges.

Affinity propagation takes as input a set of pairwise similarities between data points and finds clusters on the basis of maximizing the total similarity between data points and their exemplars. Similarity can be simply defined as negative squared Euclidean distance for compatibility with other algorithms, or it can incorporate richer domain-specific models (e.g., translation-invariant distances for comparing images). Affinity propagation’s computational and memory requirements scale linearly with the number of similarities input; for non-sparse problems where all possible similarities are computed, these requirements scale quadratically with the number of data points. Affinity propagation is demonstrated on several applications from areas such as computer vision and bioinformatics, and it typically finds better clustering solutions than other methods in less time.
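The message exchange described above can be illustrated with a minimal dense sketch in Python. The abstract does not spell out the update rules, so the "responsibility" and "availability" updates below follow the commonly presented form of the published algorithm. The function names, the `damping` and `max_iter` parameters, and the convention of storing a shared "preference" value on the diagonal of the similarity matrix are assumptions made for illustration, not details taken from this text.

```python
import numpy as np


def neg_sq_euclidean(X):
    """Pairwise similarities as negative squared Euclidean distance."""
    diff = X[:, None, :] - X[None, :, :]
    return -(diff ** 2).sum(axis=-1)


def affinity_propagation(S, damping=0.7, max_iter=200):
    """Dense affinity propagation sketch (illustrative, not the thesis code).

    S is an (n, n) similarity matrix; S[k, k] holds the preference of
    point k, i.e. how suitable it is a priori as an exemplar (assumption).
    """
    n = S.shape[0]
    idx = np.arange(n)
    R = np.zeros((n, n))  # responsibilities r(i, k)
    A = np.zeros((n, n))  # availabilities a(i, k)

    for _ in range(max_iter):
        # r(i,k) = s(i,k) - max over k' != k of [a(i,k') + s(i,k')]
        AS = A + S
        best = AS.argmax(axis=1)
        first = AS[idx, best]
        AS[idx, best] = -np.inf
        second = AS.max(axis=1)
        R_new = S - first[:, None]
        R_new[idx, best] = S[idx, best] - second
        R = damping * R + (1.0 - damping) * R_new

        # a(i,k) = min(0, r(k,k) + sum over i' not in {i,k} of max(0, r(i',k)))
        # a(k,k) = sum over i' != k of max(0, r(i',k))
        Rp = np.maximum(R, 0.0)
        Rp[idx, idx] = R[idx, idx]
        A_new = Rp.sum(axis=0)[None, :] - Rp
        diag = A_new[idx, idx].copy()
        A_new = np.minimum(A_new, 0.0)
        A_new[idx, idx] = diag
        A = damping * A + (1.0 - damping) * A_new

    # Point k is an exemplar when a(k,k) + r(k,k) > 0.
    exemplars = np.flatnonzero(np.diag(A + R) > 0)
    if exemplars.size == 0:
        return exemplars, np.full(n, -1)
    # Assign every point to its most similar exemplar.
    labels = S[:, exemplars].argmax(axis=1)
    labels[exemplars] = np.arange(exemplars.size)
    return exemplars, labels
```

To try it, build `S = neg_sq_euclidean(X)` for an `(n, d)` array `X`, write a common preference onto the diagonal with `np.fill_diagonal(S, preference)`, and call `affinity_propagation(S)`. More negative preferences generally yield fewer exemplars. The dense `(n, n)` arrays reflect the quadratic cost the abstract mentions for non-sparse problems; a sparse variant would store messages only for the similarities actually supplied.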
