Collaboration Process Pattern Approach to Improving Teamwork Performance: A Data Mining-Based Methodology

The management literature documents that the characteristics of collaboration processes strongly influence team performance in business environments. However, little work has examined how specific collaboration process patterns affect teamwork performance, which leaves an open issue in collaboration management. To address this research gap, we develop a Collaboration Process Pattern (CPP) approach that analyzes teamwork performance by mining collaboration system logs from open source software development. Our research is novel in three ways. First, it is fact-driven: the results are based on teamwork tracking logs. Second, we develop a pattern mining approach based on sequence mining and graph mining. Third, using time-dependent Cox regression, our approach derives business insights from real-world collaboration data that are directly applicable to managerial actions. Our empirical study identifies collaboration patterns that can lead to more efficient teamwork. It also shows that the effects of collaboration patterns vary with the type of task. These findings are of significant business value because they suggest that managers should focus their limited attention on certain types of tasks for intervention. Data and the online supplement are available at https://doi.org/10.1287/ijoc.2016.0739.
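
To make the idea of mining patterns from collaboration logs concrete, the following is a minimal sketch and not the authors' CPP algorithm. It assumes each task's log is an ordered list of action labels and counts contiguous action sequences that recur across tasks, a deliberately simplified stand-in for sequence mining. The task identifiers, action names, and the `frequent_patterns` helper are all hypothetical.

```python
from collections import Counter


def frequent_patterns(logs, min_support=2, max_len=3):
    """Return contiguous action patterns found in at least `min_support` task logs.

    `logs` maps a task id to its ordered list of action labels (assumed format).
    Support is the number of distinct tasks whose log contains the pattern.
    """
    support = Counter()
    for actions in logs.values():
        seen = set()
        for n in range(2, max_len + 1):
            for i in range(len(actions) - n + 1):
                seen.add(tuple(actions[i:i + n]))
        support.update(seen)  # each pattern counts at most once per task
    return {p: c for p, c in support.items() if c >= min_support}


# Hypothetical issue-tracker logs: task id -> ordered collaboration actions.
logs = {
    "task-1": ["report", "assign", "comment", "resolve"],
    "task-2": ["report", "assign", "reassign", "comment", "resolve"],
    "task-3": ["report", "comment", "resolve"],
}

patterns = frequent_patterns(logs, min_support=2, max_len=3)
for pattern, count in sorted(patterns.items()):
    print(" -> ".join(pattern), count)
```

In a fuller pipeline, indicators for whether a task exhibits each frequent pattern could serve as covariates in a survival model of task completion time, such as the time-dependent Cox regression mentioned above. That step is omitted here.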
