An analysis of design process and performance in distributed data science teams

It is often assumed that teams are better at solving problems than individuals working independently. However, recent work in engineering, design and psychology contradicts this assumption. This study examines the behavior of teams engaged in data science competitions. Crowdsourced competitions have seen increased use for software development and data science, and platforms often encourage teamwork between participants.

We specifically examine the teams participating in data science competitions hosted by Kaggle. We analyze the data provided by Kaggle to assess the effects of team size and interaction frequency on team performance. We also contextualize these results through a semantic analysis.

This work demonstrates that groups of individuals working independently may outperform interacting teams on average, but that small, interacting teams are more likely to win competitions. The semantic analysis revealed differences in forum participation, verb usage and pronoun usage between top- and bottom-performing teams.

These results reveal a perplexing tension that must be explored further: true teams may perform better with higher cohesion, but nominal teams (groups of individuals working independently) may perform even better on average with essentially no cohesion. Limitations of this research include its reliance on extant data and the fact that team member experience level was not factored in.

These results are potentially of use to designers of crowdsourced data science competitions, as well as to managers of and contributors to distributed software development projects.
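The comparison between true and nominal teams can be illustrated with a minimal sketch. It assumes a nominal team is formed by randomly grouping solo competitors and is scored by its best member, which is a common convention but not necessarily the procedure used in this study. The function name `nominal_team_scores` and all scores below are hypothetical and serve only as illustration.

```python
import random
from statistics import mean


def nominal_team_scores(solo_scores, team_size, n_teams, seed=0):
    """Build nominal teams from solo competitors.

    Each nominal team is a random sample (without replacement) of
    `team_size` solo scores and is credited with its best member's score.
    Assumption: higher scores are better.
    """
    rng = random.Random(seed)
    return [max(rng.sample(solo_scores, team_size)) for _ in range(n_teams)]


# Hypothetical leaderboard scores, not taken from the study.
solo = [0.61, 0.70, 0.55, 0.82, 0.66, 0.74, 0.59, 0.78]
true_teams = [0.72, 0.80, 0.69]  # scores of interacting three-person teams

nominal = nominal_team_scores(solo, team_size=3, n_teams=1000)

print("mean nominal team score:", round(mean(nominal), 3))
print("mean true team score:   ", round(mean(true_teams), 3))
print("best nominal team score:", max(nominal))
print("best true team score:   ", max(true_teams))
```

Comparing means addresses average performance, while comparing maxima addresses who is most likely to win; the abstract reports that these two views can favor different kinds of teams.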
