A Large-Scale Study of Programming Languages and Code Quality in GitHub

What is the effect of programming languages on software quality? This question has been debated for a very long time. In this study, we gather a very large data set from GitHub (729 projects, 80 million SLOC, 29,000 authors, 1.5 million commits, in 17 languages) in an attempt to shed some empirical light on this question. This reasonably large sample size allows us to use a mixed-methods approach, combining multiple regression modeling with visualization and text analytics, to study the effect of language features such as static vs. dynamic typing and strong vs. weak typing on software quality. By triangulating findings from different methods, and controlling for confounding effects such as team size, project size, and project history, we report that language design does have a significant but modest effect on software quality. Most notably, strong typing appears to be modestly better than weak typing, and among functional languages, static typing is also somewhat better than dynamic typing. We also find that functional languages are somewhat better than procedural languages. It is worth noting that these modest effects arising from language design are overwhelmingly dominated by process factors such as project size, team size, and commit size. However, we hasten to caution the reader that even these modest effects might quite possibly be due to other, intangible process factors, e.g., the preference of certain personality types for functional, static, and strongly typed languages.
