Building Bing Developer Assistant MSR-TR-2015-36

Software developers rely heavily on code snippets and API usage examples found on the Internet. This paper presents Bing Code Search, a Visual Studio extension that lets developers write free-form natural language questions within the IDE and get C# code snippets that answer them. Bing Code Search automatically adapts the suggested snippets to the user’s programming context via variable renaming, and records users’ interactions to improve its suggestions. Compared to prior related research, Bing Code Search fully automates the search-paste-adapt process. Three weeks after we released this free extension, more than 20,000 users had downloaded it, and they issue 3,000 queries per day on average. We believe that Bing Code Search is the most widely used tool in its category. In the following, we describe our framework in full and present empirical evidence of the benefits of Bing Code Search: (1) On our evaluation benchmark, Bing Code Search delivers more relevant snippet solutions than Bing’s results. (2) In a controlled experiment, it saved developers 28% of the time needed to complete API-related tasks. (3) Telemetry collected from thousands of users shows that some users have already built a habit of using the tool: they issue multiple queries to solve a complex task, or use it as a fast auto-completion.
