A Novel Graph-based Approach for Determining Molecular Similarity

In this paper, we tackle the problem of measuring similarity among graphs that represent real objects with noisy data. To account for noise, we relax the definition of similarity using the maximum weighted co-$k$-plex relaxation method, which allows dissimilarities among graphs up to a predetermined level. We then formulate the problem as a novel quadratic unconstrained binary optimization problem that can be solved by a quantum annealer. The context of our study is molecular similarity where the presence of noise might be due to regular errors in measuring molecular features. We develop a similarity measure and use it to predict the mutagenicity of a molecule. Our results indicate that the relaxed similarity measure, designed to accommodate the regular errors, yields a higher prediction accuracy than the measure that ignores the noise.
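To make the relaxation concrete, the following is a minimal sketch of the co-$k$-plex condition on a tiny graph. It uses the standard definition: a vertex subset is a co-$k$-plex if every chosen vertex has at most $k-1$ neighbours inside the subset, so $k=1$ reduces to an independent set. The example graph, its weights, and the function names are hypothetical. The exhaustive search merely stands in for a solver; it is not the quadratic unconstrained binary optimization formulation developed in the paper.

```python
from itertools import combinations


def is_co_k_plex(subset, edges, k):
    """Return True if every vertex in `subset` has at most k - 1 neighbours in it."""
    chosen = set(subset)
    degree = {v: 0 for v in chosen}
    for u, v in edges:
        if u in chosen and v in chosen:
            degree[u] += 1
            degree[v] += 1
    return all(d <= k - 1 for d in degree.values())


def max_weighted_co_k_plex(weights, edges, k):
    """Exhaustive search over all subsets; only practical for very small graphs."""
    vertices = list(weights)
    best_set, best_weight = (), 0.0
    for size in range(1, len(vertices) + 1):
        for subset in combinations(vertices, size):
            if is_co_k_plex(subset, edges, k):
                total = sum(weights[v] for v in subset)
                if total > best_weight:
                    best_set, best_weight = subset, total
    return set(best_set), best_weight


# Hypothetical graph (assumption, not from the paper): the path a - b - c - d,
# where an edge marks two candidate matches that contradict each other.
weights = {"a": 1.0, "b": 1.0, "c": 1.0, "d": 1.0}
edges = [("a", "b"), ("b", "c"), ("c", "d")]

strict = max_weighted_co_k_plex(weights, edges, k=1)   # no contradictions allowed
relaxed = max_weighted_co_k_plex(weights, edges, k=2)  # one contradiction per vertex

print("k=1:", strict)
print("k=2:", relaxed)
```

On this path graph the strict case ($k=1$) can keep two vertices, while the relaxed case ($k=2$) keeps three, which illustrates how tolerating a bounded number of contradictions per vertex can raise the attainable weight when the data are noisy.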
