WeightedCluster Library Manual: A practical guide to creating typologies of trajectories in the social sciences with R

This manual has a twofold aim: to present the WeightedCluster library and to offer a step-by-step guide to creating typologies of sequences for the social sciences. In particular, the library makes it possible to represent the results of a hierarchical cluster analysis graphically, to group identical sequences so that a larger number of sequences can be analysed, to compute a set of measures of partition quality, and to run an optimized PAM (Partitioning Around Medoids) algorithm that takes weights into account. The library also offers procedures that facilitate the choice of a particular clustering solution and of the optimal number of groups. In addition to the methods, we discuss the building of typologies of sequences in the social sciences and the assumptions underlying this operation. In particular, we clarify the place that the creation of typologies should be given in the analysis of sequences. We show that these methods offer an important descriptive point of view on sequences by bringing recurrent patterns to light. However, they should not be used in a confirmatory analysis, since they can lead to misleading conclusions.
