Opportunities and Challenges Applying Functional Data Analysis to the Study of Open Source Software Evolution

This paper explores the application of functional data analysis (FDA) as a means to study the dynamics of software evolution in the open source context. Several challenges in analyzing the data from software projects are discussed, an approach to overcoming those challenges is described, and preliminary results from the analysis of a sample of open source software (OSS) projects are provided. The results demonstrate the utility of FDA for uncovering and categorizing multiple distinct patterns of evolution in the complexity of OSS projects. These results are promising in that they demonstrate some patterns in which the complexity of software decreased as the software grew in size, a particularly novel result. The paper reports preliminary explorations of factors that may be associated with decreasing complexity patterns in these projects. The paper concludes by describing several next steps for this research project as well as some questions for which more sophisticated analytical techniques may be needed.
