Machine learning assisted social system analysis: Youth transitions in five south and east Mediterranean countries

Youth transitions to adulthood have been the subject of many social studies, but the statistical tools of choice are often limited in sophistication and flexibility. Our study uses data collected as part of the SAHWA project [1]. Its primary goal is to verify whether machine learning can help rule out inappropriate assumptions and improve the analysis of transitions to adulthood by outlining youth groups, their common characteristics and outlier cases (as well as whether they are significant). Because the data include numeric as well as categorical and nominal variables, common algorithms such as K-Means clustering cannot be used. Nor is it reasonable to build on Euclidean distances in this mixed space, which rules out other classification methods that rely on them. We split the selection of a clustering algorithm into three steps: (1) selection of a distance function, (2) selection of the algorithm and (3) a decision on the number of groups. Valuable information about the transition to adulthood is obtained without imposing a restrictive theoretical framework.
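The three-step selection described above can be illustrated with a minimal sketch. The abstract does not state which distance, algorithm or selection criterion the study settled on, so the choices here are assumptions made purely for illustration: Gower's dissimilarity for mixed numeric and categorical records, a simple alternating k-medoids procedure (a simplification, not the full partitioning-around-medoids swap search) and the mean silhouette width to pick the number of groups. The records, column names and function names are hypothetical and are not SAHWA data.

```python
from itertools import combinations


def gower_matrix(rows, kinds):
    """Pairwise Gower dissimilarities; kinds[j] is "num" or "cat"."""
    # Range of each numeric column, used to scale differences into [0, 1].
    ranges = [
        max(r[j] for r in rows) - min(r[j] for r in rows) if kind == "num" else None
        for j, kind in enumerate(kinds)
    ]
    n = len(rows)
    d = [[0.0] * n for _ in range(n)]
    for a, b in combinations(range(n), 2):
        total = 0.0
        for j, kind in enumerate(kinds):
            if kind == "num":
                total += abs(rows[a][j] - rows[b][j]) / ranges[j] if ranges[j] else 0.0
            else:
                total += 0.0 if rows[a][j] == rows[b][j] else 1.0
        d[a][b] = d[b][a] = total / len(kinds)
    return d


def assign(d, medoids):
    """Label every point with the index of its nearest medoid."""
    return [
        min(range(len(medoids)), key=lambda c: d[i][medoids[c]])
        for i in range(len(d))
    ]


def k_medoids(d, k, max_iter=100):
    """Alternating k-medoids on a precomputed dissimilarity matrix."""
    medoids = list(range(k))  # deterministic start: the first k points
    for _ in range(max_iter):
        labels = assign(d, medoids)
        updated = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            # New medoid: the member with the smallest total distance to the rest.
            best = min(members, key=lambda m: sum(d[m][o] for o in members)) if members else medoids[c]
            updated.append(best)
        if updated == medoids:
            break
        medoids = updated
    return assign(d, medoids)


def mean_silhouette(d, labels):
    """Mean silhouette width; singleton clusters score 0 by convention."""
    clusters = {c: [i for i, lab in enumerate(labels) if lab == c] for c in set(labels)}
    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            scores.append(0.0)
            continue
        a = sum(d[i][j] for j in own) / len(own)
        b = min(
            sum(d[i][j] for j in members) / len(members)
            for c, members in clusters.items() if c != lab
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical records: (age, employment status, lives with parents).
rows = [
    (17, "student", "yes"), (18, "student", "yes"), (19, "student", "yes"),
    (27, "employed", "no"), (28, "employed", "no"), (29, "employed", "no"),
]
kinds = ["num", "cat", "cat"]

d = gower_matrix(rows, kinds)                                   # step 1: distance
labels_by_k = {k: k_medoids(d, k) for k in (2, 3)}              # step 2: algorithm
scores = {k: mean_silhouette(d, lab) for k, lab in labels_by_k.items()}
best_k = max(scores, key=scores.get)                            # step 3: number of groups
print(best_k, labels_by_k[best_k])
```

Working from a precomputed dissimilarity matrix keeps the three decisions independent: a different distance function or clustering routine can be swapped in without touching the other two steps.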

[1] Yizong Cheng, et al. Mean Shift, Mode Seeking, and Clustering, 1995, IEEE Trans. Pattern Anal. Mach. Intell.

[2] R. Dawes. Judgment under uncertainty: The robust beauty of improper linear models in decision making, 1979.

[3] J. MacQueen. Some methods for classification and analysis of multivariate observations, 1967.

[4] J. A. Hartigan, et al. A k-means clustering algorithm, 1979.

[5] H. Ralambondrainy, et al. A conceptual version of the K-means algorithm, 1995, Pattern Recognit. Lett.

[6] Sara E. Goldstein, et al. Loneliness, Stress, and Social Support in Young Adulthood: Does the Source of Support Matter?, 2016, Journal of Youth and Adolescence.

[7] K. Roberts, et al. Modernisation theory meets Tunisia’s youth during and since the revolution of 2011, 2017.

[8] Anoop Nayak. Response to Review of Race, Place and Globalization: Youth Cultures in a Changing World, 2023, Children, Youth and Environments.

[9] Lipika Dey, et al. A k-mean clustering algorithm for mixed numeric and categorical data, 2007, Data Knowl. Eng.

[10] J. Coleman. Youth: Transition to Adulthood, 1974.

[11] K. Roberts, et al. Leisure and the Life-Cycle Squeeze among Young Adults in North Africa Countries, 2018.

[12] P. Rousseeuw. Silhouettes: a graphical aid to the interpretation and validation of cluster analysis, 1987.

[13] Ashish Sharma, et al. An Enhanced Density Based Spatial Clustering of Applications with Noise, 2009, 2009 IEEE International Advance Computing Conference.

[14] D. Rubin, et al. Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper, 1977.

[15] M. J. van der Laan, et al. A new partitioning around medoids algorithm, 2003.

[16] The Influence of Policy Context on Transition Age Foster Youths' Views of Self-Sufficiency, 2017.

[17] Catherine A. Sugar, et al. Finding the Number of Clusters in a Dataset, 2003.

[18] Joshua Zhexue Huang, et al. Extensions to the k-Means Algorithm for Clustering Large Data Sets with Categorical Values, 1998, Data Mining and Knowledge Discovery.

[19] A. R. de Leon, et al. A generalized Mahalanobis distance for mixed data, 2005.

[20] V. Markovska, et al. Improving alumni network efficiency with machine learning, 2017.

[21] Charles A. Bouman, et al. CLUSTER: An Unsupervised Algorithm for Modeling Gaussian Mixtures, 2014.

[22] J. Gower. Some distance properties of latent root and vector methods used in multivariate analysis, 1966.

[23] J. Gower. A General Coefficient of Similarity and Some of Its Properties, 1971.