Bisect and Conquer: Hierarchical Clustering via Max-Uncut Bisection

Hierarchical clustering is an unsupervised data analysis method that has been widely used for decades. Despite its popularity, its analytical foundation was underdeveloped. To address this, Dasgupta recently introduced an optimization viewpoint of hierarchical clustering with pairwise similarity information, which spurred a line of work that both shed light on old algorithms (e.g., Average-Linkage) and designed new ones. Here, for the maximization dual of Dasgupta's objective (introduced by Moseley-Wang), we present polynomial-time 0.4246-approximation algorithms that use Max-Uncut Bisection as a subroutine. The previous best worst-case approximation factor in polynomial time was 0.336, only a slight improvement over Average-Linkage, which achieves 1/3. Finally, we complement our positive results with an APX-hardness result (even for 0-1 similarities) under the Small Set Expansion hypothesis.
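
For readers unfamiliar with the two objectives named above, the following minimal sketch evaluates them on a toy instance. It uses the standard definitions: Dasgupta's cost charges each pair (i, j) its similarity times the number of leaves under the pair's lowest common ancestor, and the Moseley-Wang reward credits the same pair its similarity times n minus that number. The nested-tuple tree encoding, the function name `hc_objectives`, and the example weights are illustrative assumptions, not taken from the paper, and the code only evaluates a given binary tree; it is not the approximation algorithm itself.

```python
from itertools import product


def hc_objectives(tree, w):
    """Return (Dasgupta cost, Moseley-Wang reward) of a binary tree.

    tree: nested 2-tuples whose leaves are hashable labels (assumed encoding).
    w:    dict mapping frozenset({i, j}) to a similarity; missing pairs count as 0.
    """

    def count_leaves(node):
        if not isinstance(node, tuple):
            return 1
        return count_leaves(node[0]) + count_leaves(node[1])

    n = count_leaves(tree)
    cost = 0
    reward = 0

    def visit(node):
        nonlocal cost, reward
        if not isinstance(node, tuple):
            return [node]
        left, right = visit(node[0]), visit(node[1])
        size = len(left) + len(right)  # leaves under this internal node
        # Pairs separated at this node have this node as lowest common ancestor.
        split = sum(w.get(frozenset((i, j)), 0) for i, j in product(left, right))
        cost += split * size
        reward += split * (n - size)
        return left + right

    visit(tree)
    return cost, reward


# Toy instance (hypothetical weights): four points, three nonzero similarities.
tree = (("a", "b"), ("c", "d"))
w = {frozenset("ab"): 3, frozenset("cd"): 2, frozenset("bc"): 1}
cost, reward = hc_objectives(tree, w)
```

For any fixed tree the two quantities sum to n times the total similarity, so minimizing one is equivalent to maximizing the other at optimality, although their approximation guarantees differ.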
