Summarizing Software API Usage Examples Using Clustering Techniques

Developers often rely on third-party libraries to facilitate software development, but the lack of proper API documentation for these libraries undermines their reuse potential. Although several approaches extract usage examples for libraries, they are usually tied to specific language implementations, and the examples they produce are often redundant and are not presented as concise, readable snippets. In this work, we propose a novel approach that extracts API call sequences from client source code and clusters them to produce a diverse set of source code snippets that effectively covers the target API. We further construct a summarization algorithm that presents concise and readable snippets to users. Evaluating our system on software libraries, we show that it achieves high coverage of API methods, and that the produced snippets are of high quality and closely match handwritten examples.
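To make the idea of clustering API call sequences concrete, the following is a minimal sketch under stated assumptions, not the system described above. It assumes call sequences have already been extracted as lists of method names, measures distance with a normalized longest common subsequence (LCS), groups sequences with a simple single-linkage threshold rule, and picks a medoid per cluster as its representative. The distance metric, the clustering rule, the threshold, and the example sequences are all illustrative choices introduced here.

```python
from itertools import combinations


def lcs_length(a, b):
    """Length of the longest common subsequence of two call sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def distance(a, b):
    """1 minus LCS similarity, normalized by the longer sequence (0 = identical)."""
    if not a and not b:
        return 0.0
    return 1.0 - lcs_length(a, b) / max(len(a), len(b))


def cluster(sequences, threshold=0.5):
    """Single-linkage grouping (assumed rule): link pairs closer than threshold."""
    parent = list(range(len(sequences)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i, j in combinations(range(len(sequences)), 2):
        if distance(sequences[i], sequences[j]) < threshold:
            parent[find(j)] = find(i)

    groups = {}
    for i in range(len(sequences)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


def medoid(sequences, members):
    """Index of the member with the smallest total distance to its cluster."""
    return min(
        members,
        key=lambda i: sum(distance(sequences[i], sequences[j]) for j in members),
    )


def coverage(sequences, representatives):
    """Fraction of distinct API methods that appear in the representatives."""
    all_methods = {m for seq in sequences for m in seq}
    covered = {m for i in representatives for m in sequences[i]}
    return len(covered) / len(all_methods)


# Hypothetical call sequences, standing in for ones mined from client code.
sequences = [
    ["File.open", "File.read", "File.close"],
    ["File.open", "File.readline", "File.read", "File.close"],
    ["Socket.connect", "Socket.send", "Socket.close"],
    ["Socket.connect", "Socket.send", "Socket.recv", "Socket.close"],
]

clusters = cluster(sequences)
representatives = [medoid(sequences, members) for members in clusters]
print(clusters, representatives, coverage(sequences, representatives))
```

Choosing one representative per cluster keeps the resulting examples diverse, while the coverage figure gives a rough sense of how much of the API those few examples exercise. A density-based clusterer or a tree-based distance could be swapped in for the assumed threshold rule and LCS metric.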
