Learning to rank code examples for code search engines

Developers use source code examples to implement unfamiliar tasks by learning from existing solutions. To help developers find such solutions, code search engines are designed to locate and rank code examples relevant to users' queries. Essentially, a code search engine provides a ranking schema, which combines a set of ranking features to calculate the relevance between a query and candidate code examples. The ranking schema is then used to place relevant code examples at the top of the result list. However, it is difficult to determine the configuration of a ranking schema manually, based on subjective judgment. In this paper, we propose a code example search approach that applies a machine learning technique to automatically train a ranking schema. We use the trained ranking schema to rank candidate code examples for new queries at run-time. We evaluate the ranking performance of our approach using a corpus of over 360,000 code snippets crawled from 586 open-source Android projects. The performance evaluation study shows that the learning-to-rank approach can effectively rank code examples, and outperforms the existing ranking schemas by about 35.65% and 48.42% in terms of the normalized discounted cumulative gain (NDCG) and expected reciprocal rank (ERR) measures, respectively.
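To make the two evaluation measures concrete, the following minimal Python sketch computes NDCG and ERR for a single ranked result list. It assumes graded relevance labels (integers from 0 up to a maximum grade) and the common exponential-gain formulations of both measures; the abstract does not state which variants or cut-offs were used in the study, so the function names, the `k` cut-off, and the `max_grade` parameter are illustrative assumptions rather than the paper's implementation.

```python
import math


def dcg(gains, k=None):
    """Discounted cumulative gain with exponential gain (2^g - 1) (assumed variant)."""
    top = gains if k is None else gains[:k]
    # enumerate() is 0-based, so rank r = i + 1 and the discount is log2(r + 1).
    return sum((2 ** g - 1) / math.log2(i + 2) for i, g in enumerate(top))


def ndcg(gains, k=None):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(gains, reverse=True), k)
    return dcg(gains, k) / ideal if ideal > 0 else 0.0


def err(gains, max_grade):
    """Expected reciprocal rank under the cascade user model (assumed variant)."""
    p_continue = 1.0  # probability the user has not been satisfied so far
    total = 0.0
    for rank, g in enumerate(gains, start=1):
        p_satisfied = (2 ** g - 1) / (2 ** max_grade)
        total += p_continue * p_satisfied / rank
        p_continue *= 1 - p_satisfied
    return total


# Hypothetical relevance labels for the top results of one query, in ranked order.
ranked_labels = [3, 0, 2, 1]
print(ndcg(ranked_labels), err(ranked_labels, max_grade=3))
```

Both measures reward placing highly relevant examples near the top of the list; ERR additionally discounts a result when an earlier result is already likely to have satisfied the user.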
