Personalized project recommendation on GitHub

GitHub is a software development platform that facilitates collaboration and participation in project development. Developers typically search it for relevant projects in order to reuse functions and identify useful features. However, finding suitable projects among the many hosted on GitHub is difficult, and different users have different requirements. A recommendation system can help developers by reducing the time required to find suitable projects. In this paper, we propose a project recommendation approach that considers both developer behaviors and project features. The approach automatically recommends the top-N most relevant software projects and integrates user feedback to improve recommendation accuracy. The results of an empirical study using data crawled from GitHub demonstrate that the proposed approach can efficiently recommend relevant software projects with relatively high precision.
