Are Software Dependency Supply Chain Metrics Useful in Predicting Change of Popularity of NPM Packages?

Background: As software development becomes more interdependent, unique relationships among software packages arise and form complex software ecosystems. Aim: We aim to better understand the behavior of these ecosystems through the lens of software supply chains, and to model how the software dependency network affects the change in downloads of JavaScript packages. Method: We analyzed 12,999 popular NPM packages between 1 December 2017 and 15 March 2018 using Linear Regression and Random Forest models, and examined how predictors representing different aspects of the software dependency supply chain affect the change in a package's number of downloads. Results: Preliminary results suggest that the count and downloads of upstream and downstream runtime dependencies have a strong effect on the change in downloads: packages with fewer, more popular dependencies (upstream or downstream) are more likely to see an increase in downloads. This suggests that interpreting a package's download count properly requires taking into account the peculiarities of its supply chain, both upstream and downstream. Conclusion: Future work is needed to identify the effects of added, deleted, and unchanged dependencies for different types of packages, such as build tools and test tools.
