Learning to Represent Edits

We introduce the problem of learning distributed representations of edits. By combining a "neural editor" with an "edit encoder", our models learn to represent the salient information of an edit and can be used to apply edits to new inputs. We experiment on natural language and source code edit data. Our evaluation yields promising results, suggesting that our neural network models learn to capture the structure and semantics of edits. We hope that this task and data source will inspire other researchers to work further on this problem.
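To make the model composition concrete, the sketch below shows one way the two components could fit together: an edit encoder that summarizes a (before, after) pair into a fixed-size edit vector, and a neural editor that conditions on that vector to transform a new input. This is a minimal illustrative sketch in PyTorch; the class names, layer choices, and tensor shapes are assumptions for exposition, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

class EditEncoder(nn.Module):
    """Summarizes a (before, after) pair into a fixed-size edit vector."""
    def __init__(self, embed_dim: int, edit_dim: int):
        super().__init__()
        # One simple choice: a bidirectional GRU over the two sequences,
        # concatenated feature-wise (assumes the sequences are aligned).
        self.rnn = nn.GRU(2 * embed_dim, edit_dim, batch_first=True, bidirectional=True)
        self.proj = nn.Linear(2 * edit_dim, edit_dim)

    def forward(self, before: torch.Tensor, after: torch.Tensor) -> torch.Tensor:
        # before, after: (batch, seq_len, embed_dim)
        pair = torch.cat([before, after], dim=-1)
        _, h = self.rnn(pair)                                # (2, batch, edit_dim)
        return self.proj(torch.cat([h[0], h[1]], dim=-1))    # (batch, edit_dim)

class NeuralEditor(nn.Module):
    """Applies an edit vector to a new input by conditioning a decoder on it."""
    def __init__(self, embed_dim: int, edit_dim: int, hidden_dim: int, vocab_size: int):
        super().__init__()
        self.rnn = nn.GRU(embed_dim + edit_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, x: torch.Tensor, edit_vec: torch.Tensor) -> torch.Tensor:
        # Broadcast the edit vector to every timestep of the input being edited.
        cond = edit_vec.unsqueeze(1).expand(-1, x.size(1), -1)
        h, _ = self.rnn(torch.cat([x, cond], dim=-1))
        return self.out(h)                                   # per-token output logits

# Usage: represent an edit from one pair, then apply it to a new input.
encoder = EditEncoder(embed_dim=64, edit_dim=32)
editor = NeuralEditor(embed_dim=64, edit_dim=32, hidden_dim=128, vocab_size=10_000)
x_before, x_after = torch.randn(1, 12, 64), torch.randn(1, 12, 64)
new_input = torch.randn(1, 12, 64)
edit_vector = encoder(x_before, x_after)    # what changed, as a vector
logits = editor(new_input, edit_vector)     # (1, 12, 10000)
```

Training such a pair end to end, e.g. by maximizing the likelihood of the "after" version given the "before" version and the edit vector, forces the encoder to pack the salient content of the edit into a bottleneck representation that can transfer to new inputs.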
