A large-scale, in-depth analysis of developers' personalities in the Apache ecosystem

Abstract

Context: Large-scale distributed projects are typically the result of collective efforts by multiple developers with heterogeneous personalities.

Objective: We aim to find evidence that personality can explain developers' behavior in large-scale distributed projects. For example, the propensity to trust others, a critical factor for the success of global software engineering, has been found to positively influence the outcome of code reviews in distributed projects.

Method: We perform a quantitative analysis of ecosystem-level data from the code commits and email messages contributed by developers working on Apache Software Foundation (ASF) projects, taken as representative of large-scale distributed projects.

Results: We find three common types of personality profiles among Apache developers, characterized in particular by their levels of Agreeableness and Neuroticism. We also confirm that developers' personality is stable over time. Moreover, developers' personality traits do not vary with their role, membership, or extent of contribution to the projects. We also find evidence that more open developers are more likely to become contributors to Apache projects.

Conclusion: Overall, our findings reinforce the need for future studies on human factors in software engineering to use psychometric tools to control for differences in developers' personalities.
