A modified Yule process to model the evolution of some object-oriented system properties

We present a model, based on the Yule process, that explains the evolution of some properties of large object-oriented software systems. We study four properties related to code production in four large object-oriented systems: Eclipse, Netbeans, JDK and Ant. The properties analysed, namely the naming of variables and methods, calls to methods, and inheritance hierarchies, follow a power-law distribution, as previous papers have reported for other systems. We validate the model by simulation and find very good agreement between the empirical data from successive software versions and the model's predictions.
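To illustrate the kind of simulation involved, the following is a minimal sketch of a generic Yule (Simon-style) process, not the modified model of this paper. It assumes that at each step a new item (for example, a new method name) appears with a fixed probability, and that otherwise an existing item is reused with probability proportional to how often it has already been used. The function name simulate_simon_yule and the parameters steps, p_new and seed are hypothetical and chosen only for this example.

```python
import random
from collections import Counter


def simulate_simon_yule(steps, p_new, seed=0):
    """Generic Simon-style Yule process (illustrative assumption, not the paper's model).

    Returns a Counter mapping item id -> number of uses.
    """
    rng = random.Random(seed)
    # One list entry per use. Drawing uniformly from this list selects an
    # item with probability proportional to its current number of uses.
    occurrences = [0]  # item 0 starts with a single use
    next_id = 1
    for _ in range(steps):
        if rng.random() < p_new:
            # A brand-new item enters the system with one use.
            occurrences.append(next_id)
            next_id += 1
        else:
            # Preferential reuse: popular items are more likely to be picked.
            occurrences.append(rng.choice(occurrences))
    return Counter(occurrences)


counts = simulate_simon_yule(steps=50_000, p_new=0.1)
# sizes[k] = number of items used exactly k times (the empirical distribution).
sizes = Counter(counts.values())
for k in sorted(sizes)[:5]:
    print(k, sizes[k])
```

Plotting sizes on log-log axes should show an approximately straight tail, which is the power-law signature the abstract refers to. Comparing such a simulated distribution with counts extracted from successive versions of a real system would be one way to check a model of this kind.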
