Complexity of Categorical Time Series

Categorical time series covering comparable time spans often differ in several respects: the number of distinct states, the number of transitions, and the distribution of durations over states. Each of these aspects contributes to an aggregate property of such series called complexity. Sociologists and demographers believe that complexity differs systematically between groups as a result of social structure or social change; such groups differ in, for example, age, gender, or status. The author proposes quantifications of complexity based on the number of distinct subsequences, combined, when the states have associated durations, with the variance of those durations. A simple algorithm to compute these coefficients is provided, and some of their statistical properties are investigated in an application to the family formation histories of young American females.
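
The ingredients named above can be illustrated with a minimal sketch. It uses a standard dynamic-programming recurrence for counting distinct subsequences, which is not necessarily the algorithm the author provides, and it does not reproduce the proposed coefficients themselves. The function names, the state labels "S", "C" and "M", and the choice to count subsequences on the series of distinct successive states are assumptions made for illustration only.

```python
from itertools import groupby


def count_distinct_subsequences(seq):
    """Count the distinct subsequences of seq, including the empty one."""
    total = 1  # only the empty subsequence so far
    total_before_last = {}  # state -> count just before its latest occurrence
    for state in seq:
        previous_total = total
        # Appending `state` to every known subsequence doubles the count;
        # those already produced at its previous occurrence are duplicates.
        total = 2 * total - total_before_last.get(state, 0)
        total_before_last[state] = previous_total
    return total


def describe(seq):
    """Summarise the three aspects listed in the abstract (hypothetical helper)."""
    spells = [(state, len(list(run))) for state, run in groupby(seq)]
    return {
        "distinct_states": len(set(seq)),
        "transitions": max(len(spells) - 1, 0),
        "durations": spells,  # (state, spell length) pairs
    }


# Hypothetical history with one observation per period.
history = ["S", "S", "C", "C", "C", "M", "M", "S"]

# Assumption: collapse repeated states before counting subsequences.
distinct_successive = [state for state, _ in groupby(history)]

print(describe(history))
print(count_distinct_subsequences(distinct_successive))
```

The recurrence runs in time linear in the length of the series, and the spell lengths returned by `describe` are the durations whose variance would enter a duration-sensitive coefficient.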
