Synthesizing Number Transformations from Input-Output Examples

Numbers are one of the most widely used data types in programming languages. Number transformations like formatting and rounding present a challenge even for experienced programmers, who find it difficult to remember the different number format strings supported by different programming languages. These transformations present an even bigger challenge for end-users of spreadsheet systems like Microsoft Excel, where providing such custom format strings is beyond their expertise. In our extensive case study of help forums for many programming languages and Excel, we found that both programmers and end-users struggle with these number transformations, but are able to easily express their intent using input-output examples. In this paper, we present a framework that can learn such number transformations from very few input-output examples. We first describe an expressive number transformation language that can model these transformations, and then present an inductive synthesis algorithm that can learn all expressions in this language that are consistent with a given set of examples. We also present a ranking scheme for these expressions that enables efficient learning of the desired transformation from very few examples. By combining our inductive synthesis algorithm for number transformations with an inductive synthesis algorithm for syntactic string transformations, we obtain an inductive synthesis algorithm for manipulating data types that have numbers as a constituent sub-type, such as date, unit, and time. We have implemented our algorithms as an Excel add-in and have evaluated them successfully over several benchmarks obtained from the help forums and the Excel product team.
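
To make the idea of learning a transformation from examples concrete, the following is a minimal sketch and not the language or algorithm of the paper. It assumes a hypothetical two-parameter language (a rounding mode and a number of decimal places), finds every program consistent with the examples by brute-force enumeration, and orders the survivors with an assumed preference for round-to-nearest and fewer decimal places. The names `ROUNDING_MODES`, `apply_program`, `rank`, and `synthesize` are illustrative only.

```python
from decimal import Decimal, ROUND_HALF_UP, ROUND_FLOOR, ROUND_CEILING
from itertools import product

# Hypothetical toy language: a program is a pair (rounding mode, decimal places).
ROUNDING_MODES = {
    "nearest": ROUND_HALF_UP,
    "down": ROUND_FLOOR,
    "up": ROUND_CEILING,
}
DECIMAL_PLACES = range(0, 4)


def apply_program(program, text):
    """Run one (mode, places) program on a number given as a string."""
    mode, places = program
    quantum = Decimal(1).scaleb(-places)  # places=2 gives Decimal("0.01")
    rounded = Decimal(text).quantize(quantum, rounding=ROUNDING_MODES[mode])
    return str(rounded)


def rank(program):
    """Assumed ranking: prefer round-to-nearest, then fewer decimal places."""
    mode, places = program
    return (mode != "nearest", places)


def synthesize(examples):
    """Return all programs consistent with the examples, best-ranked first."""
    candidates = product(ROUNDING_MODES, DECIMAL_PLACES)
    consistent = [
        program
        for program in candidates
        if all(apply_program(program, inp) == out for inp, out in examples)
    ]
    return sorted(consistent, key=rank)


if __name__ == "__main__":
    # One example leaves two consistent programs; ranking picks between them.
    print(synthesize([("3.14159", "3.14")]))
    # A second example rules out rounding down.
    print(synthesize([("3.14159", "3.14"), ("2.718", "2.72")]))
```

With a single example, both round-to-nearest and round-down at two decimal places remain consistent, so the ranking decides which one is proposed first. A second example removes the ambiguity. This mirrors, in a very reduced form, why a ranking over consistent expressions helps when only a few examples are given.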
