Programming by Optimisation Meets Parameterised Algorithmics: A Case Study for Cluster Editing

Inspired by methods and theoretical results from parameterised algorithmics, we improve the state of the art in solving Cluster Editing, a prominent NP-hard clustering problem with applications in computational biology and beyond. In particular, we demonstrate that an extension of a preprocessing algorithm known in parameterised algorithmics as the \((k+1)\)-data reduction rule, embedded in a sophisticated branch-and-bound algorithm, outperforms existing algorithms based on Integer Linear Programming (ILP) and branch-and-bound. Furthermore, our version of the \((k+1)\)-rule outperforms the theoretically most effective preprocessing algorithm, which yields a \(2k\)-vertex kernel. Notably, this \(2k\)-vertex kernel is analysed empirically for the first time here. Our new algorithm was developed by integrating Programming by Optimisation into the classical algorithm engineering cycle, an approach which we expect to be successful in many other contexts.
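
For orientation, the following is a minimal sketch of the classic \((k+1)\)-data reduction rule for unweighted Cluster Editing as it is usually stated in the parameterised algorithmics literature. It is not the extended version developed in this work, which the abstract does not specify. The function name `k_plus_one_rule` and the adjacency-dictionary representation are assumptions made for illustration. The idea: if two vertices share more than \(k\) common neighbours, separating them would cost more than \(k\) edits, so they must end up in the same cluster; if more than \(k\) vertices are adjacent to exactly one of them, they must end up in different clusters.

```python
from itertools import combinations


def k_plus_one_rule(adj, k):
    """Classic (k+1)-rule sketch for unweighted Cluster Editing.

    adj: dict mapping each vertex to the set of its neighbours
         (undirected simple graph; hypothetical representation).
    k:   edit budget (maximum number of edge insertions/deletions).

    Returns (permanent, forbidden), two sets of frozenset vertex pairs,
    or None if some pair proves that no solution with <= k edits exists.
    """
    permanent, forbidden = set(), set()
    for u, v in combinations(sorted(adj), 2):
        nu = adj[u] - {v}
        nv = adj[v] - {u}
        common = len(nu & nv)       # vertices adjacent to both u and v
        non_common = len(nu ^ nv)   # vertices adjacent to exactly one of them
        if common > k and non_common > k:
            return None             # both placements of u, v exceed the budget
        if common > k:
            permanent.add(frozenset((u, v)))   # u, v must share a cluster
        elif non_common > k:
            forbidden.add(frozenset((u, v)))   # u, v must be separated
    return permanent, forbidden


# Example: u and v share the three neighbours a, b, c; budget k = 2.
graph = {
    "u": {"a", "b", "c"},
    "v": {"a", "b", "c"},
    "a": {"u", "v"},
    "b": {"u", "v"},
    "c": {"u", "v"},
}
result = k_plus_one_rule(graph, 2)
```

In a full solver, pairs marked permanent would typically be merged and pairs marked forbidden would have their edge deleted, with \(k\) reduced accordingly; that bookkeeping is omitted here.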
