DAVE: Deriving Automatically Verilog from English

Specifications for digital systems are written in natural language, and engineers expend significant effort translating them into the programming languages understood by digital-design compilers. Automating this process lets designers work in the language in which they are most comfortable - the original natural language - and focus instead on downstream design challenges. We explore the use of state-of-the-art machine learning (ML) to automatically derive Verilog snippets from English by fine-tuning GPT-2, a natural language ML system. We describe our approach for producing a suitable dataset of novice-level digital design tasks and provide a detailed exploration of GPT-2, finding encouraging translation performance across our task sets (94.8% correct), with the ability to handle both simple and abstract design tasks.
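Fine-tuning a causal language model such as GPT-2 for this kind of translation amounts to serializing each (English description, Verilog solution) pair into a single training sequence, so the model learns to continue a description with code. The sketch below illustrates one such serialization; the delimiter, the `format_pair` helper, and the example task are illustrative assumptions, not the exact format used by DAVE.

```python
# Hedged sketch: preparing an English->Verilog pair for causal-LM
# fine-tuning. The separator and end-of-text marker here are assumed,
# not taken from the paper; GPT-2's conventional EOS token is shown.

def format_pair(description: str, verilog: str,
                sep: str = "\n// ---\n",
                eos: str = "<|endoftext|>") -> str:
    """Concatenate a natural-language task and its Verilog solution
    into one training sequence for next-token prediction."""
    return description.strip() + sep + verilog.strip() + eos

# A novice-level task of the kind the paper's dataset targets.
example = format_pair(
    "Write a module that ANDs two 1-bit inputs a and b into output y.",
    "module and_gate(input a, input b, output y);\n"
    "  assign y = a & b;\n"
    "endmodule",
)
print(example)
```

At inference time, the fine-tuned model is given only the description plus the separator and asked to generate the continuation, which is then taken as the candidate Verilog snippet.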
