Antecedent Predictions Are Dominant for Tree-Based Code Generation

Code generation aims to automatically convert natural language (NL) utterances into code snippets. Sequence-to-tree (Seq2Tree) methods such as TRANX have been proposed for code generation; they guarantee the compilability of the generated code by producing each subsequent Abstract Syntax Tree (AST) node conditioned on the antecedent predictions of AST nodes. Existing Seq2Tree methods tend to treat antecedent and subsequent predictions equally. However, under the AST constraints, it is difficult for Seq2Tree models to produce correct subsequent predictions from incorrect antecedent predictions, so antecedent predictions ought to receive more attention than subsequent ones. To this end, we propose an effective method named APTRANX (Antecedent Prioritized TRANX), built on top of TRANX. APTRANX introduces an Antecedent Prioritized (AP) Loss, which drives the model to attach more importance to antecedent predictions by exploiting the position information of the generated AST nodes. With better antecedent predictions and the subsequent predictions that build on them, APTRANX significantly improves performance. Extensive experiments on several benchmark datasets demonstrate the superiority and generality of our method over state-of-the-art baselines.
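To make the idea concrete, below is a minimal sketch of a position-weighted training objective in the spirit of the AP Loss. It assumes a PyTorch setting and an exponentially decaying weight over the AST generation steps; the function name antecedent_prioritized_loss, the decay parameter, and the exact weighting schedule are illustrative assumptions, not the formulation used in APTRANX.

```python
import torch
import torch.nn.functional as F


def antecedent_prioritized_loss(logits: torch.Tensor,
                                targets: torch.Tensor,
                                decay: float = 0.9) -> torch.Tensor:
    """Position-weighted cross-entropy over one AST generation sequence.

    logits  : (T, V) unnormalized scores for the T generated AST nodes/actions
    targets : (T,)   gold node/action indices
    decay   : factor in (0, 1]; earlier (antecedent) steps receive larger
              weights (a hypothetical schedule, see the note above)
    """
    # per-step cross-entropy, one value per generated AST node
    per_step = F.cross_entropy(logits, targets, reduction="none")  # shape (T,)
    # weight w_t = decay**t decreases with the generation step t, so
    # antecedent predictions dominate the training signal
    steps = torch.arange(targets.size(0), device=logits.device, dtype=logits.dtype)
    weights = decay ** steps
    return (weights * per_step).sum() / weights.sum()


# toy usage: 5 decoding steps over a vocabulary of 10 AST actions
logits = torch.randn(5, 10, requires_grad=True)
targets = torch.randint(0, 10, (5,))
loss = antecedent_prioritized_loss(logits, targets)
loss.backward()
```

Under such a weighting, a mistake on an early (antecedent) node incurs a larger penalty than a mistake on a late (subsequent) node, matching the intuition that incorrect antecedent predictions make the rest of the derivation hard to recover.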

[1] Fandong Meng, et al. Exploring Dynamic Selection of Branch Expansion Orders for Code Generation. ACL, 2021.

[2] Jianwei Cui, et al. Improving Tree-Structured Decoder Training for Code Generation via Mutual Learning. AAAI, 2021.

[3] Graham Neubig, et al. Incorporating External Knowledge through Pre-training for Natural Language to Code Generation. ACL, 2020.

[4] Lili Mou, et al. TreeGen: A Tree-Based Transformer Architecture for Code Generation. AAAI, 2019.

[5] Xin Xia, et al. Code Generation as a Dual Task of Code Summarization. NeurIPS, 2019.

[6] Graham Neubig, et al. Reranking for Neural Semantic Parsing. ACL, 2019.

[7] Oleksandr Polozov, et al. Program Synthesis and Semantic Parsing with Learned Code Idioms. NeurIPS, 2019.

[8] Lili Mou, et al. A Grammar-Based Structural CNN Decoder for Code Generation. AAAI, 2018.

[9] Graham Neubig, et al. TRANX: A Transition-based Neural Abstract Syntax Parser for Semantic Parsing and Code Generation. EMNLP, 2018.

[10] Graham Neubig, et al. Retrieval-Based Neural Code Generation. EMNLP, 2018.

[11] Graham Neubig, et al. Learning to Mine Aligned Code and Natural Language Pairs from Stack Overflow. MSR, 2018.

[12] Mirella Lapata, et al. Coarse-to-Fine Decoding for Neural Semantic Parsing. ACL, 2018.

[13] Lukasz Kaiser, et al. Attention Is All You Need. NIPS, 2017.

[14] Dan Klein, et al. Abstract Syntax Networks for Code Generation and Semantic Parsing. ACL, 2017.

[15] Tommi S. Jaakkola, et al. Tree-structured Decoding with Doubly-Recurrent Neural Networks. ICLR, 2016.

[16] Wang Ling, et al. Latent Predictor Networks for Code Generation. ACL, 2016.

[17] Mirella Lapata, et al. Language to Logical Form with Neural Attention. ACL, 2016.

[18] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization. ICLR, 2014.

[19] Raymond J. Mooney, et al. Learning to Parse Database Queries Using Inductive Logic Programming. AAAI/IAAI, 1996.

[20] George R. Doddington, et al. The ATIS Spoken Language Systems Pilot Corpus. HLT, 1990.

[21] Yubin Ge, et al. An AST Structure Enhanced Decoder for Code Generation. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2022.