Understanding the Usage, Impact, and Adoption of Non-OSI Approved Licenses

The software license is one of the most important non-executable pieces of any software system. However, due to its non-technical nature, developers often misuse or misunderstand software licenses. Although previous studies have reported problems related to license clashes and inconsistencies, in this paper we shed light on an important yet overlooked issue: the use of non-approved open-source licenses. Such licenses claim to be open-source, but have not been formally approved by the Open Source Initiative (OSI). When a developer releases software under a non-approved license, even if the intent is to make it open-source, the original author might not be granting the rights required by those who use the software. To uncover the reasons behind the use of non-approved licenses, we conducted a mixed-method study, mining data from 657K open-source projects and their 4,367K versions, and surveying 76 developers who published some of these projects. Although 1,058,554 of the project versions employ at least one non-approved license, non-approved licenses account for 21.51% of license usage. We also observed that it is not uncommon for developers to change from a non-approved to an approved license. When asked, some developers mentioned that this transition was due to a better understanding of the disadvantages of using a non-approved license. This perspective is particularly important since developers often rely on package managers to easily and quickly get their dependencies working.
