A robust data-driven approach identifies four personality types across four large data sets

Understanding human personality has been a focus for philosophers and scientists for millennia¹. It is now widely accepted that there are about five major personality domains that describe the personality profile of an individual²,³. In contrast to personality traits, the existence of personality types remains extremely controversial⁴. Despite the various purported personality types described in the literature, small sample sizes and the lack of reproducibility across data sets and methods have led to inconclusive results about personality types⁵,⁶. Here we develop an alternative approach to the identification of personality types, which we apply to four large data sets comprising more than 1.5 million participants. We find robust evidence for at least four distinct personality types, extending and refining previously suggested typologies. We show that these types appear as a small subset of a much more numerous set of spurious solutions in typical clustering approaches, highlighting principal limitations in the blind application of unsupervised machine learning methods to the analysis of big data.

Gerlach and colleagues harness the power of big data to address the question of whether individuals can be reliably classified into personality types and how many types exist. The authors identify four different personality types, which surface across data sets.

[1] Adrian Furnham, et al. The Wiley-Blackwell handbook of individual differences, 2013.

[2] J. Carroll. An analytical solution for approximating simple structure in factor analysis, 1953.

[3] Konrad P. Körding, et al. A high-reproducibility and high-accuracy method for automated topic classification, 2014, ArXiv.

[4] Simine Vazire, et al. Knowing me, knowing you: the accuracy and unique predictive validity of self-ratings and other-ratings of daily behavior, 2008, Journal of personality and social psychology.

[5] Michal Kosinski, et al. Divided We Stand: Three Psychological Regions of the United States and Their Political, Economic, Social, and Health Correlates, 2013.

[6] Peter J. Rentfrow, et al. Regional Personality Differences in Great Britain, 2015, PLoS ONE.

[7] Martin Krzywinski, et al. Points of Significance: Clustering, 2017, Nature Methods.

[8] Chris G. Sibley, et al. Validation of the four-profile configuration of personality types within the Five-Factor Model, 2017.

[9] Paul T. Costa, et al. Personality Disorders and the Five-Factor Model of Personality, 1994.

[10] L. R. Goldberg, et al. The Big-Five factor structure as an integrative framework: an analysis of Clarke's AVA model, 1996, Journal of personality assessment.

[11] Lewis R Goldberg, et al. Replicability and 40-year predictive power of childhood ARC types, 2011, Journal of personality and social psychology.

[12] Phil A. Silva, et al. Temperamental qualities at age three predict personality traits in young adulthood: longitudinal evidence from a birth cohort, 1995, Child development.

[13] G. Schwarz. Estimating the Dimension of a Model, 1978.

[14] Gaël Varoquaux, et al. Scikit-learn: Machine Learning in Python, 2011, J. Mach. Learn. Res.

[15] David R. Anderson, et al. Model Selection and Multimodel Inference, 2003.

[16] Li Lei, et al. The relationship between personality types and prosocial behavior and aggression in Chinese adolescents, 2016.

[17] A. Caspi, et al. Resilient, overcontrolled, and undercontrolled boys: three replicable personality types, 1996, Journal of personality and social psychology.

[18] Antonio Terracciano, et al. Hierarchical linear modeling analyses of the NEO-PI-R scales in the Baltimore Longitudinal Study of Aging, 2005, Psychology and aging.

[19] H. Kaiser. The varimax criterion for analytic rotation in factor analysis, 1958.

[20] Anil K. Jain. Data clustering: 50 years beyond K-means, 2008, Pattern Recognit. Lett.

[21] Trevor Hastie, et al. The Elements of Statistical Learning, 2001.

[22] Paul T. Costa, et al. Personality disorders and the five-factor model of personality, 3rd ed., 2013.

[23] Rens Van de Schoot, et al. Personality types in adolescence: change and stability and links with adjustment and relationships: a five-wave longitudinal study, 2011, Developmental psychology.

[24] D L Newman, et al. Behavioral observations at age 3 years predict adult psychiatric disorders. Longitudinal evidence from a birth cohort, 1996, Archives of general psychiatry.

[25] Maike Luhmann, et al. On the consistency of personality types across adulthood: latent profile analyses in two large-scale panel studies, 2014, Journal of personality and social psychology.

[26] Marcus Roth, et al. Beyond resilients, undercontrollers, and overcontrollers? an extension of personality prototype research, 2006.

[27] V. Benet‐Martínez, et al. Personality and the prediction of consequential outcomes, 2006, Annual review of psychology.

[28] S. Gosling, et al. Should we trust web-based studies? A comparative analysis of six preconceptions about internet questionnaires, 2004, The American psychologist.

[29] Thomas A. Widiger, et al. Five Factor Model of Personality, Personality Disorder, 2015.

[30] Antonio Terracciano, et al. Person‐factors in the California Adult Q‐Set: closing the door on personality trait types?, 2006.

[31] Richard W. Robins, et al. Resilient, Overcontrolled, and Undercontrolled Personality Types: Issues and Controversies, 2010.

[32] Valerie J. Lund. Issues and Controversies in Sinus Surgery, 1997.

[33] Margaret L. Kern, et al. The Oxford Handbook of the Five Factor Model of Personality, 2016.

[34] G. Matthews, et al. The SAGE handbook of personality theory and assessment. Volume 1, Personality theories and models, 2008.

[35] Martin Hirzel, et al. Machine learning in Python with no strings attached, 2019, MAPL@PLDI.

[36] J. Horn. A rationale and test for the number of factors in factor analysis, 1965, Psychometrika.

[37] Paul T. Costa, et al. The replicability and utility of three personality types, 2002.

[38] Jens B. Asendorpf, et al. Carving personality description at its joints: Confirmation of three replicable personality prototypes for both children and adults, 2001.

[39] S. Gosling, et al. Facebook as a research tool for the social sciences: Opportunities, challenges, ethical considerations, and practical guidelines, 2015, The American psychologist.

[40] Jack Block, et al. Lives Through Time, 1983.

[41] A. Tellegen, et al. An alternative "description of personality": the big-five factor structure, 1990, Journal of personality and social psychology.

[42] William Revelle, et al. Individual Differences and Differential Psychology, 2013.

[43] M. Ashton, et al. An Investigation of Personality Types within the HEXACO Personality Framework, 2009.

[44] John A. Johnson. Measuring thirty facets of the Five Factor Model with a 120-item public domain inventory: Development of the IPIP-NEO-120, 2014.

[45] S. Fortunato, et al. Resolution limit in community detection, 2006, Proceedings of the National Academy of Sciences.

[46] M. Eysenck, et al. Personality and Individual Differences: A Natural Science Approach, 1985.

[47] Patrizia Steca, et al. The utility of a well-known personality typology in studying successful aging: Resilients, undercontrollers, and overcontrollers in old age, 2010.