SENSORY: Leveraging Code Statement Sequence Information for Code Snippets Recommendation

Software developers often have to implement unfamiliar programming tasks. When facing such tasks, they frequently search online for code snippets to use as references. In recent years, researchers have proposed several approaches that use the programming context to recommend code snippets. Most of these approaches rely on information retrieval techniques and treat a code snippet as a set of tokens. In code, however, the smallest meaningful unit is the code statement, which generally corresponds to a line of code. Because existing approaches do not take this into account, there is still room for improvement in code snippet recommendation. In this paper, we propose a code Statement sEquence iNformation baSed cOde snippets Recommendation sYstem (SENSORY). Unlike existing token-based approaches, SENSORY performs code snippet recommendation at the granularity of code statements. It uses the Burrows-Wheeler Transform algorithm to search for relevant code snippets and uses structural information to re-rank the results. To evaluate the effectiveness of the proposed method, we construct a code database of 1,000,000 real-world code snippets containing more than 15,000,000 lines of code. The experimental results show that SENSORY outperforms two strong baselines in terms of precision and NDCG.
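To make the idea of statement-granularity search concrete, the following is a minimal Python sketch of a Burrows-Wheeler Transform backward search in which each symbol is a whole code statement rather than a token. It is an illustration under stated assumptions, not SENSORY's implementation: the class name `StatementIndex`, the whitespace-only `normalize` step, the naive suffix-array construction, and the example snippets are all hypothetical, and the structure-based re-ranking step is not shown.

```python
from bisect import bisect_left, bisect_right
from collections import Counter

SEP = 0  # separator placed after every snippet; it never appears in a query


def normalize(stmt):
    """Assumed normalization: collapse whitespace so each statement is one symbol."""
    return " ".join(stmt.split())


class StatementIndex:
    """Hypothetical BWT-style index over sequences of code statements."""

    def __init__(self, snippets):
        self.vocab = {}    # normalized statement -> integer symbol (>= 1)
        self.text = []     # all snippets concatenated as symbols
        self.starts = []   # offset in self.text where each snippet begins
        for snippet in snippets:
            self.starts.append(len(self.text))
            for stmt in snippet:
                sym = self.vocab.setdefault(normalize(stmt), len(self.vocab) + 1)
                self.text.append(sym)
            self.text.append(SEP)
        text = self.text
        # Naive suffix array; adequate for a sketch, too slow for a large corpus.
        self.sa = sorted(range(len(text)), key=lambda i: text[i:])
        # BWT: the symbol that precedes each sorted suffix.
        bwt = [text[i - 1] for i in self.sa]
        # rows[sym] lists the BWT rows holding sym, in increasing order.
        self.rows = {}
        for row, sym in enumerate(bwt):
            self.rows.setdefault(sym, []).append(row)
        # first[sym] = number of symbols in the text that sort before sym.
        self.first, total = {}, 0
        counts = Counter(text)
        for sym in sorted(counts):
            self.first[sym] = total
            total += counts[sym]

    def _occ(self, sym, row):
        """Number of occurrences of sym in the BWT before the given row."""
        return bisect_left(self.rows.get(sym, []), row)

    def search(self, query):
        """Return ids of snippets containing the query statements consecutively."""
        if not query:
            return []
        lo, hi = 0, len(self.text)
        for stmt in reversed(query):  # backward search: last statement first
            sym = self.vocab.get(normalize(stmt))
            if sym is None:
                return []
            lo = self.first[sym] + self._occ(sym, lo)
            hi = self.first[sym] + self._occ(sym, hi)
            if lo >= hi:
                return []
        # Map each matching text position back to the snippet that contains it.
        return sorted({bisect_right(self.starts, self.sa[r]) - 1 for r in range(lo, hi)})


snippets = [
    ["File f = new File(path);", "FileReader r = new FileReader(f);",
     "BufferedReader br = new BufferedReader(r);", "String line = br.readLine();"],
    ["BufferedReader br = new BufferedReader(r);", "String line = br.readLine();",
     "br.close();"],
    ["String line = br.readLine();", "BufferedReader br = new BufferedReader(r);"],
]
index = StatementIndex(snippets)
context = ["BufferedReader br = new BufferedReader(r);", "String  line = br.readLine();"]
print(index.search(context))  # [0, 1]
```

The third snippet contains the same two statements in the opposite order and is not returned, whereas a set-of-tokens representation would not distinguish it from the other two. That difference is the property the statement-sequence view is meant to capture.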
