The dimensionality of discourse

The paragraph spaces of five text corpora, spanning different genres, intended audiences, and four languages, all show the same two-scale structure: the dimension at short distances is lower than at long distances. In all five cases the short-distance dimension is approximately eight. Control simulations with randomly permuted word instances do not exhibit a low-dimensional structure. The observed topology places important constraints, which may be universal, on the way in which authors construct prose.
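To illustrate what a distance-dependent dimension means, here is a minimal sketch of one standard estimator, the correlation sum: count the fraction of point pairs closer than a radius r and take the slope of log C(r) against log r over a chosen range of r. This is not necessarily the authors' procedure. The Euclidean metric, the function names `correlation_sum` and `local_dimension`, the radius range, and the synthetic data (a flat 2D sheet embedded in 5D, standing in for paragraph vectors) are all assumptions made for illustration.

```python
import numpy as np

def correlation_sum(points, radii):
    """Fraction of distinct point pairs whose distance is below each radius."""
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))   # Euclidean metric (assumption)
    upper = np.triu_indices(len(points), k=1)    # each pair counted once
    pair_dists = dists[upper]
    return np.array([(pair_dists < r).mean() for r in radii])

def local_dimension(points, r_lo, r_hi, n_radii=10):
    """Slope of log C(r) versus log r over the range [r_lo, r_hi]."""
    radii = np.logspace(np.log10(r_lo), np.log10(r_hi), n_radii)
    c = correlation_sum(points, radii)
    slope, _ = np.polyfit(np.log(radii), np.log(c), 1)
    return slope

# Synthetic stand-in for paragraph vectors: a 2D sheet embedded in 5D.
rng = np.random.default_rng(0)
sheet = np.zeros((800, 5))
sheet[:, :2] = rng.uniform(0.0, 1.0, size=(800, 2))

dim = local_dimension(sheet, r_lo=0.05, r_hi=0.2)
print(f"estimated dimension: {dim:.2f}")
```

On this synthetic sheet the estimate should land near the intrinsic dimension of two rather than the embedding dimension of five. A two-scale structure of the kind described above would show up as different slopes when the fit is restricted to short versus long radius ranges.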
