Improving Open Source Software Process Quality Based on Defect Data Mining

Open source software (OSS) project managers often need to observe key project indicators, e.g., how much effort is needed to finish certain tasks, in order to assess and improve project and product quality, for instance by analyzing defect data from OSS developer activities. Previous work analyzed the defect data of OSS projects using correlation analysis for defect prediction on a combination of product and process metrics. However, correlation analysis focuses on the relationship between two variables without characterizing that relationship. We propose an observation framework that explores the relationships among OSS defect metrics using a data mining approach (the heuristics mining algorithm). Our main results show that the framework can support OSS project managers in observing key project indicators, e.g., by checking conformance between the designed and the actual process models.
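
To make the heuristics mining idea concrete, the following is a minimal sketch of the dependency measure commonly associated with heuristics mining, applied to defect state histories. It is not the implementation used in this work: the defect states, the sample traces, the function names, and the 0.6 threshold are all hypothetical and chosen only for illustration.

```python
from collections import Counter

def directly_follows(traces):
    """Count how often state a is directly followed by state b."""
    counts = Counter()
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts

def dependency(counts, a, b):
    """Dependency measure: (|a>b| - |b>a|) / (|a>b| + |b>a| + 1)."""
    ab = counts[(a, b)]
    ba = counts[(b, a)]
    if a == b:
        # Self-loops use the length-one-loop variant of the measure.
        return ab / (ab + 1)
    return (ab - ba) / (ab + ba + 1)

def mine_dependency_graph(traces, threshold):
    """Keep only the transitions whose dependency value reaches the threshold."""
    counts = directly_follows(traces)
    states = sorted({state for trace in traces for state in trace})
    graph = {}
    for a in states:
        for b in states:
            value = dependency(counts, a, b)
            if value >= threshold:
                graph[(a, b)] = value
    return graph

# Hypothetical defect histories (one list of states per defect report).
traces = [
    ["NEW", "ASSIGNED", "RESOLVED", "CLOSED"],
    ["NEW", "ASSIGNED", "RESOLVED", "CLOSED"],
    ["NEW", "ASSIGNED", "RESOLVED", "REOPENED", "ASSIGNED", "RESOLVED", "CLOSED"],
    ["NEW", "RESOLVED", "CLOSED"],
]

counts = directly_follows(traces)
graph = mine_dependency_graph(traces, threshold=0.6)  # threshold is an assumption

for (a, b), value in sorted(graph.items()):
    print(f"{a} -> {b}: {value:.2f}")
```

With these sample traces, infrequent transitions such as NEW directly to RESOLVED fall below the threshold and drop out of the mined model. Comparing the surviving transitions with the transitions allowed by a designed defect life cycle is one simple way to approach the conformance check mentioned above.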
