Spotting working code examples

Working code examples are useful resources for pragmatic reuse in software development. A working code example provides a solution to a specific programming problem. Earlier studies have shown that existing code search engines are not successful in finding working code examples: they fail to rank high-quality code examples at the top of the result set. To address this shortcoming, a variety of pattern-based solutions have been proposed in the literature. However, these solutions cannot be integrated seamlessly into Internet-scale source code search engines because of their high time complexity or query language restrictions. In this paper, we propose an approach for spotting working code examples that can be adopted by Internet-scale source code search engines. The time complexity of our approach is as low as that of existing code search engines on the Internet and considerably lower than that of pattern-based approaches supporting free-form queries. We study the performance of our approach using a representative corpus of 25,000 open source Java projects. Our findings support the feasibility of our approach for Internet-scale code search. We also found that our approach outperforms the Ohloh Code search engine, previously known as Koders, in spotting working code examples.
