Building Natural Language Interfaces to Web APIs

As the Web evolves towards a service-oriented architecture, application program interfaces (APIs) are becoming an increasingly important way to provide access to data, services, and devices. We study the problem of building natural language interfaces to APIs (NL2APIs), with a focus on web APIs for web services. NL2APIs have many potential benefits, such as making it easier to integrate web services into virtual assistants. We propose the first end-to-end framework for building an NL2API for a given web API. A key challenge is collecting training data, i.e., pairs of NL commands and API calls, from which an NL2API can learn the semantic mapping from ambiguous, informal NL commands to formal API calls. We propose a novel crowdsourcing approach to collecting this training data, in which crowd workers are employed to generate diverse NL commands. We then optimize the crowdsourcing process to further reduce its cost. Specifically, we propose a novel hierarchical probabilistic model of the crowdsourcing process, which guides us to allocate budget to the API calls that have a high value for training NL2APIs. We apply our framework to real-world APIs and show that it can collect high-quality training data at low cost and build NL2APIs with good performance from scratch. We also show that modeling the crowdsourcing process improves its effectiveness: the training data collected with our approach leads to better NL2API performance than data collected with a strong baseline.
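To make the notion of a training pair concrete, the following is a minimal Python sketch of how an NL command might be paired with a formal API call, and of how a fixed annotation budget could be spent on the calls judged most valuable. Everything in it is an illustrative assumption: the `ApiCall` and `TrainingPair` types, the `api.example.com` endpoint, the parameter names, and the numeric value scores are hypothetical, and the simple top-k selection only stands in for the hierarchical probabilistic model described above; it is not that model.

```python
from dataclasses import dataclass
from urllib.parse import urlencode


@dataclass(frozen=True)
class ApiCall:
    """A formal API call: an endpoint plus query parameters (hypothetical schema)."""
    endpoint: str
    params: tuple  # sequence of (name, value) pairs

    def to_url(self) -> str:
        # urlencode accepts a sequence of two-element tuples.
        if not self.params:
            return self.endpoint
        return f"{self.endpoint}?{urlencode(self.params)}"


@dataclass(frozen=True)
class TrainingPair:
    """One training example: an informal NL command and the call it should map to."""
    command: str
    call: ApiCall


def allocate_budget(values: dict, budget: int) -> list:
    """Pick the `budget` API calls with the highest (assumed) value scores.

    This is a plain top-k placeholder, not the paper's probabilistic model.
    """
    ranked = sorted(values, key=values.get, reverse=True)
    return ranked[:budget]


# Hypothetical example: an email-search style API.
call = ApiCall(
    endpoint="https://api.example.com/messages",
    params=(("filter", "isRead eq false"), ("top", "5")),
)
pair = TrainingPair(command="show me my five unread emails", call=call)

# Hypothetical value scores for three candidate API calls.
chosen = allocate_budget({"call_a": 0.9, "call_b": 0.1, "call_c": 0.5}, budget=2)
```

In this sketch, crowd workers would supply the `command` side for each selected call, so the choice of which calls to annotate directly determines how the budget is spent.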
