Uncovering the differences in linguistic network dynamics of book and social media texts

Complex network studies span a wide variety of applications, including linguistic networks. To investigate how book and social media texts differ in terms of linguistic typology, we constructed both sequential and sentence collocation networks of book, Facebook and Twitter texts, with undirected and weighted edges. The comparisons are based on basic network parameters such as average degree, modularity, average clustering coefficient, average path length, diameter and average link weight. We also present distribution graphs for node degrees, edge weights and the maximum degree differences between paired nodes. The occurrences of degree differences are further detailed with grayscale percentile plots with respect to the edge weights. We link the network analysis to linguistic aspects such as word and sentence length distributions. We conclude that linguistic typology shows formal usage in books, which deviates slightly toward informal usage on Twitter. In terms of network parameters, Facebook falls between these two media, while the informality of Twitter is mostly driven by its character limit.
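The two network types named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact construction rules, so this assumes that a sequential network links each pair of adjacent words within a sentence and that a sentence collocation network links every pair of distinct words sharing a sentence; the tokenizer and all function names are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations

PUNCT = ".,!?;:\"'()"


def tokenize(sentence):
    """Lowercase a sentence and split it on whitespace (assumed tokenizer)."""
    words = (w.strip(PUNCT).lower() for w in sentence.split())
    return [w for w in words if w]


def sequential_network(sentences):
    """Assumed rule: link adjacent words; edge weight = co-occurrence count."""
    edges = Counter()
    for sentence in sentences:
        words = tokenize(sentence)
        for a, b in zip(words, words[1:]):
            if a != b:  # skip self-loops
                edges[frozenset((a, b))] += 1  # frozenset makes the edge undirected
    return edges


def sentence_network(sentences):
    """Assumed rule: link every pair of distinct words in the same sentence."""
    edges = Counter()
    for sentence in sentences:
        for a, b in combinations(sorted(set(tokenize(sentence))), 2):
            edges[frozenset((a, b))] += 1
    return edges


def degrees(edges):
    """Unweighted degree of each node: number of distinct neighbours."""
    deg = Counter()
    for pair in edges:
        for node in pair:
            deg[node] += 1
    return deg


def summary(edges):
    """Two of the basic parameters listed in the abstract."""
    n_nodes = len(degrees(edges))
    n_edges = len(edges)
    return {
        "nodes": n_nodes,
        "edges": n_edges,
        "average_degree": 2 * n_edges / n_nodes if n_nodes else 0.0,
        "average_link_weight": sum(edges.values()) / n_edges if n_edges else 0.0,
    }


def degree_differences(edges):
    """Absolute degree difference between the two end nodes of each edge."""
    deg = degrees(edges)
    result = {}
    for pair in edges:
        a, b = tuple(pair)
        result[pair] = abs(deg[a] - deg[b])
    return result


if __name__ == "__main__":
    toy = ["The cat sat.", "The cat ran."]
    print(summary(sequential_network(toy)))
    print(summary(sentence_network(toy)))
```

Modularity, clustering coefficient, path length and diameter are left out to keep the sketch dependency-free; a graph library would normally be used for those.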
