The effects of genre on dependency distance and dependency direction

Abstract

Dependency distance, the distance between words linked in a syntactic dependency, has been a key measure of interest in cross-linguistic corpus work because it is hypothesized to reflect working memory demands during sentence processing. However, previous work has not systematically investigated the effects of text genre on dependency distance in English. We might expect spoken language to have shorter dependencies, and informative technical language to have longer dependencies, than fiction and imaginative language. The current study uses quantitative methods to explore the distribution of dependency distance in English across genres, controlling for sentence length. The data come from ten genres of English in the British National Corpus. The results show that 1) the distributions of dependency distances across all sentence lengths and genres follow the same parametric distributions; 2) genre affects the adjacent dependency rate significantly, but its effect is very small; 3) sentence length and genre affect dependency distance significantly, but the effect is small, with shorter dependencies in written-to-be-spoken texts and longer dependencies in the imaginative genre than in the informative genres; and 4) genre affects dependency direction significantly, but again the effect is small. Overall, the effect of genre on these dependency measures is small, which suggests that dependency distance is determined primarily by universal cognitive factors rather than by genre-specific stylistic factors.
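The three measures named in the abstract can be illustrated with a minimal sketch. It assumes a sentence is encoded as a list of 1-based head positions with 0 marking the root, and it takes dependency distance to be the absolute difference in word positions. The function name `dependency_measures`, the encoding, and the example parse are illustrative assumptions, not the study's actual pipeline.

```python
from typing import List, Tuple


def dependency_measures(heads: List[int]) -> Tuple[float, float, float]:
    """Return (mean dependency distance, adjacent dependency rate,
    head-initial rate) for one sentence.

    Assumed encoding (hypothetical, for illustration only):
    heads[i] is the 1-based position of the head of word i + 1,
    and 0 marks the root, which has no incoming dependency.
    """
    distances = []
    head_initial = 0
    for position, head in enumerate(heads, start=1):
        if head == 0:
            continue  # the root contributes no dependency
        distances.append(abs(head - position))
        if head < position:
            head_initial += 1  # head precedes its dependent

    n = len(distances)
    if n == 0:
        return 0.0, 0.0, 0.0

    mean_distance = sum(distances) / n
    adjacent_rate = sum(1 for d in distances if d == 1) / n
    head_initial_rate = head_initial / n
    return mean_distance, adjacent_rate, head_initial_rate


# Assumed parse of "The dog chased the cat":
# The -> dog, dog -> chased, chased = root, the -> cat, cat -> chased
example_heads = [2, 3, 0, 5, 3]
mean_dd, adj_rate, hi_rate = dependency_measures(example_heads)
print(mean_dd, adj_rate, hi_rate)
```

In this toy sentence, three of the four dependencies link adjacent words, and only one ("cat" depending on "chased") has its head before the dependent. Comparing such per-sentence values across genres at a fixed sentence length is the kind of controlled comparison the abstract describes.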
