Simulated annealing for optimization of graphs and sequences

Optimization of discrete structures aims to generate a new structure with a better property given an existing one; it is a fundamental problem in machine learning. Unlike continuous optimization, realistic applications of discrete optimization (e.g., text generation) are very challenging because discrete structures are subject to complex, long-range constraints covering both syntax and semantics. In this work, we present SAGS, a novel Simulated Annealing framework for Graph and Sequence optimization. The key idea is to integrate powerful neural networks into metaheuristics (e.g., simulated annealing, SA) to restrict the search space in discrete optimization. We start by defining a sophisticated objective function that involves the property of interest and pre-defined constraints (e.g., grammar validity). SAGS then searches the discrete space towards this objective by performing a sequence of local edits, where deep generative neural networks propose the editing content and thus control the quality of each edit. We evaluate SAGS on paraphrase generation and molecule generation as instances of sequence optimization and graph optimization, respectively. Extensive results show that our approach achieves state-of-the-art performance compared with existing paraphrase generation methods in both automatic and human evaluations. Further, SAGS also significantly outperforms all previous methods in molecule generation.

† These authors contributed equally to this work.

‡ This paper is an extension of Liu et al. (2020b), published at ACL 2020. There is more than 40% new material, including graph optimization algorithms and several experiments on molecule generation. The code of our work is available at: https://github.com/liuxg16/UPSA

∗ To whom correspondence should be addressed. Email: songsen@mail.tsinghua.edu.cn

This article is an accepted manuscript of Neurocomputing and is under the CC-BY-NC-ND license. The formal publication of this manuscript is: Xianggen Liu, Pengyong Li, Fandong Meng, Hao Zhou, Huasong Zhong, Jie Zhou, Lili Mou, Sen Song. (2021). Simulated annealing for optimization of graphs and sequences. Neurocomputing, 465:310-324. https://doi.org/10.1016/j.neucom.2021.09.003

arXiv:2110.01384v1 [cs.LG] 1 Oct 2021
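
The search procedure summarized in the abstract (an objective function, a sequence of local edits, and simulated-annealing acceptance) can be illustrated with a minimal sketch. This is not the authors' implementation. The names `propose_edit`, `toy_objective`, and `simulated_annealing` are hypothetical, the uniform token sampler merely stands in for the deep generative proposal network, and both the linear cooling schedule and the toy objective are assumptions chosen for illustration.

```python
import math
import random


def propose_edit(seq, vocab, rng):
    """Return a copy of seq with one random local edit (replace, insert, or delete).

    Assumption: tokens are drawn uniformly from vocab as a stand-in for a
    learned proposal distribution.
    """
    seq = list(seq)
    op = rng.choice(["replace", "insert", "delete"])
    if op == "delete" and len(seq) > 1:
        del seq[rng.randrange(len(seq))]
    elif op == "insert":
        seq.insert(rng.randrange(len(seq) + 1), rng.choice(vocab))
    else:
        # Replace; also used when a delete would empty the sequence.
        seq[rng.randrange(len(seq))] = rng.choice(vocab)
    return seq


def toy_objective(seq, target=("a", "b", "c", "d")):
    """Toy score (higher is better): position-wise matches minus a length penalty."""
    matches = sum(s == t for s, t in zip(seq, target))
    return matches - abs(len(seq) - len(target))


def simulated_annealing(init, objective, vocab, steps=2000, t_init=1.0, seed=0):
    """Maximize objective over sequences with Metropolis-style acceptance."""
    rng = random.Random(seed)
    current, current_score = list(init), objective(init)
    best, best_score = list(current), current_score
    for step in range(steps):
        # Assumed linear cooling schedule, floored to avoid division by zero.
        temperature = max(t_init * (1 - step / steps), 1e-6)
        candidate = propose_edit(current, vocab, rng)
        candidate_score = objective(candidate)
        delta = candidate_score - current_score
        # Always accept improvements; accept worse candidates with
        # probability exp(delta / temperature), which shrinks as it cools.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_score = candidate, candidate_score
            if current_score > best_score:
                best, best_score = list(current), current_score
    return best, best_score


best, score = simulated_annealing(["d", "d"], toy_objective, vocab=list("abcd"))
print(best, score)
```

Early in the run the high temperature lets the search accept edits that lower the score, which helps it escape local optima; as the temperature falls, the procedure approaches greedy hill climbing. In a setting like the one the abstract describes, the toy objective would be replaced by a score combining the property of interest with the constraints, and the uniform sampler by a learned proposal model.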
