Mining Collaboration Patterns from a Large Developer Network

In this study, we extract patterns from a large developer collaboration network obtained from SourceForge.Net, at both high and low levels of detail. At the high level, we extract various network-level statistics from the network. At the low level, we extract topological sub-graph patterns that are frequently seen among collaborating developers. Extracting sub-graph patterns from large graphs is a hard, NP-complete problem. To address this challenge, we employ a novel combination of graph mining and graph matching that leverages network-level properties of a developer network. With this approach, we successfully analyze a snapshot of SourceForge.Net data taken in September 2009. We present the mined patterns and describe interesting observations.
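To make the two levels of detail concrete, the following is a minimal sketch in Python, not the authors' pipeline. It assumes that two developers are linked when they belong to the same project (the abstract does not state the edge definition), and it uses a hypothetical toy membership table in place of the SourceForge.Net snapshot. The network-level statistics stand in for the high-level view, and the triangle count stands in for the simplest possible sub-graph pattern; the function names are illustrative only.

```python
from collections import defaultdict, deque
from itertools import combinations


def build_collaboration_graph(memberships):
    """Build an undirected graph: developers are nodes, and an edge joins
    two developers who share a project (an assumed edge definition)."""
    graph = defaultdict(set)
    for developers in memberships.values():
        devs = set(developers)
        for dev in devs:
            graph[dev]  # touch the key so solo developers still appear as nodes
        for a, b in combinations(sorted(devs), 2):
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def connected_components(graph):
    """Return the connected components as a list of node sets (BFS)."""
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        components.append(component)
    return components


def count_triangles(graph):
    """Count triangles, each once, by requiring a < b < c."""
    count = 0
    for a in graph:
        for b in graph[a]:
            if a < b:
                for c in graph[a] & graph[b]:
                    if b < c:
                        count += 1
    return count


def network_statistics(graph):
    """Collect a few network-level statistics for the graph."""
    degrees = {dev: len(neighbors) for dev, neighbors in graph.items()}
    components = connected_components(graph)
    return {
        "developers": len(graph),
        "edges": sum(degrees.values()) // 2,
        "max_degree": max(degrees.values(), default=0),
        "components": len(components),
        "largest_component": max((len(c) for c in components), default=0),
        "triangles": count_triangles(graph),
    }


# Hypothetical toy data: project name -> developers on that project.
toy_memberships = {
    "proj_a": ["alice", "bob", "carol"],
    "proj_b": ["carol", "dave"],
    "proj_c": ["erin"],
}
stats = network_statistics(build_collaboration_graph(toy_memberships))
```

Counting a fixed small pattern such as a triangle is cheap; the difficulty the abstract refers to arises when the patterns are not known in advance and must be discovered across a large graph.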
