Analysis of reactive search optimisation techniques for the maximum clique problem and applications

This thesis introduces analysis tools for improving the current state of the art of heuristics for the Maximum Clique (MC) problem. The analysis focusses on algorithmic building blocks, on their contribution to solving hard instances of the MC problem, and on the development of new tools for the visualisation of search landscapes. As a result of the analysis of the algorithmic building blocks, we re-engineer an existing Reactive Local Search heuristic for the Maximum Clique problem (RLS-MC). We propose implementation and algorithmic improvements over the original RLS-MC aimed at faster restarts and greater diversification. The newly designed algorithm (RLS-LTM) is one order of magnitude faster than the original RLS-MC on some benchmark instances, but the proposed algorithmic changes also affect the dynamically adjusted tabu tenure, which grows wildly on some hard instances. A more in-depth analysis of the search dynamics of RLS-MC and RLS-LTM reveals the reasons behind the tabu tenure explosion and sheds new light on the reactive mechanism. We design and implement RLS-fast, which cures the tabu tenure explosion of RLS-LTM while retaining the performance improvement over RLS-MC. Moreover, building on the knowledge gained from the analysis, we propose a new hyper-heuristic which defines the new state of the art, and a novel supervised clustering technique based on a clique-finding component.
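To make the "dynamically adjusted tabu tenure" concrete, the following is a minimal sketch of a reactive tenure update in the general style of reactive tabu search. It is not the rule used by RLS-MC, RLS-LTM or RLS-fast: the function name `update_tenure`, the 1.1 and 0.9 factors, the `patience` threshold and the `max_tenure` cap are all illustrative assumptions.

```python
def update_tenure(tenure, repeated, steps_since_increase, max_tenure, patience=10):
    """Return an adjusted tabu tenure (hypothetical rule, for illustration only).

    tenure               -- current prohibition period, in iterations (>= 1)
    repeated             -- True if the search just revisited a known configuration
    steps_since_increase -- iterations elapsed since the tenure last grew
    max_tenure           -- assumed upper bound that keeps the tenure from exploding
    patience             -- assumed number of quiet iterations before shrinking
    """
    if repeated:
        # A repetition signals too little diversification: grow the tenure
        # by at least one step, but never beyond the cap.
        return min(max(int(tenure * 1.1), tenure + 1), max_tenure)
    if steps_since_increase > patience:
        # No repetitions for a while: relax the prohibition,
        # but keep the tenure at least 1.
        return max(min(int(tenure * 0.9), tenure - 1), 1)
    return tenure
```

Under these assumptions, an uncapped variant of the growth branch would let the tenure increase without bound on instances where repetitions are frequent, which is the kind of behaviour the abstract calls a tabu tenure explosion.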
