Visualizing API Usage Examples at Scale

Using existing APIs properly is a key challenge in programming, given that libraries and APIs are increasing in number and complexity. Programmers often search for online code examples in Q&A forums and read tutorials and blog posts to learn how to use a given API. However, there is often a massive number of related code examples, and it is difficult for a user to understand the commonalities and variations among them while still being able to drill down to concrete details. We introduce an interactive visualization for exploring a large collection of code examples mined from open-source repositories at scale. This visualization summarizes hundreds of code examples in one synthetic code skeleton, annotated with statistical distributions for the canonicalized statements and structures enclosing an API call. We implemented this interactive visualization for a set of Java APIs and found in a lab study that it helped users (1) answer significantly more API usage questions correctly and comprehensively and (2) explore how other programmers have used an unfamiliar API.
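To make the idea of "statistical distributions for canonicalized statements" concrete, here is a minimal sketch in Python. It is not the system described above: the renaming table, the helper names (canonicalize, statement_distribution), and the three toy Java snippets are all assumptions made for illustration. The sketch rewrites concrete variable names to shared placeholders and then reports the fraction of examples that contain each canonical statement.

```python
import re
from collections import Counter


def canonicalize(stmt, renames):
    """Rewrite concrete variable names to placeholders (hypothetical scheme)."""
    if renames:
        # Single pass over the statement so one rename cannot affect another.
        pattern = re.compile(r"\b(" + "|".join(map(re.escape, renames)) + r")\b")
        stmt = pattern.sub(lambda m: renames[m.group(1)], stmt)
    # Normalize whitespace so formatting differences do not split counts.
    return re.sub(r"\s+", " ", stmt).strip()


def statement_distribution(examples):
    """Map each canonical statement to the fraction of examples containing it."""
    counts = Counter()
    for statements, renames in examples:
        # A set, so a statement repeated inside one example counts once.
        counts.update({canonicalize(s, renames) for s in statements})
    total = len(examples)
    return {stmt: n / total for stmt, n in counts.items()}


# Toy input: (statements, variable -> placeholder) pairs, invented for this sketch.
EXAMPLES = [
    (["FileInputStream in = new FileInputStream(path);",
      "in.read(buf);",
      "in.close();"],
     {"in": "stream", "path": "file", "buf": "buffer"}),
    (["FileInputStream fis = new FileInputStream(f);",
      "fis.read(data);"],
     {"fis": "stream", "f": "file", "data": "buffer"}),
    (["FileInputStream s = new FileInputStream(name);",
      "s.read(b);",
      "s.close();"],
     {"s": "stream", "name": "file", "b": "buffer"}),
]

if __name__ == "__main__":
    dist = statement_distribution(EXAMPLES)
    for stmt, share in sorted(dist.items(), key=lambda kv: -kv[1]):
        print(f"{share:5.0%}  {stmt}")
```

In this toy data the constructor and read calls appear in every example, while the close call appears in two of three. Per-statement frequencies of this kind are what a skeleton view could surface next to each line; a real pipeline would derive the renaming from program analysis rather than a hand-written table.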
