MAPO: Mining and Recommending API Usage Patterns

To improve software productivity when constructing new software systems, programmers often reuse existing libraries or frameworks by invoking methods provided in their APIs. Those API methods, however, are often complex and not well documented. To learn how the API methods are used, programmers often turn to a source code search tool to find code snippets that use the API methods of interest. The returned snippets, however, are often so numerous that their sheer number becomes a barrier to locating the useful ones. To help programmers overcome this barrier, we have developed an API usage mining framework and its supporting tool, called MAPO (Mining API usage Pattern from Open source repositories), for mining API usage patterns automatically. A mined pattern describes that, in a certain usage scenario, some API methods are frequently called together and their usages follow some sequential rules. MAPO further recommends the mined API usage patterns and their associated code snippets upon programmers' requests. Our experimental results show that, with these patterns, MAPO helps programmers locate useful code snippets more effectively than two state-of-the-art code search tools. To investigate whether MAPO can assist programmers in programming tasks, we further conducted an empirical study. The results show that, when facing relatively complex API usages, programmers using MAPO produce code with fewer bugs than programmers using the two state-of-the-art code search tools.
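As a rough illustration of what "frequently called together, following sequential rules" can mean, the sketch below counts ordered pairs of API calls that recur across call sequences. It is not MAPO's mining algorithm: the function `frequent_ordered_pairs`, the `min_support` threshold, and the sample call sequences are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def frequent_ordered_pairs(call_sequences, min_support):
    """Return ordered pairs (a, b) such that a is called before b
    in at least `min_support` of the given call sequences."""
    counts = Counter()
    for seq in call_sequences:
        # Record each ordered pair at most once per sequence, so the
        # count reflects how many sequences support the pair.
        pairs_in_seq = set()
        for i, j in combinations(range(len(seq)), 2):
            if seq[i] != seq[j]:
                pairs_in_seq.add((seq[i], seq[j]))
        counts.update(pairs_in_seq)
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Hypothetical call sequences, one per code snippet (assumed data).
sequences = [
    ["File.open", "File.read", "File.close"],
    ["File.open", "File.write", "File.close"],
    ["File.open", "File.read", "File.close"],
    ["Logger.get", "Logger.info"],
]

patterns = frequent_ordered_pairs(sequences, min_support=3)
print(patterns)
```

With this toy input, only the pair in which `File.open` precedes `File.close` reaches the support threshold, which is the kind of ordering rule a usage pattern is meant to capture. A real miner would handle longer subsequences and group snippets by usage scenario first.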
