Converging evidence from corpus and experimental data to capture idiomaticity

Abstract: It is by now an established fact that idiomaticity cannot be equated with non-compositionality alone, but is a complex concept that is also associated with various aspects of formal flexibility. This raises the question of the extent to which speakers draw on these different factors when judging the overall idiomaticity of a phrase. The present paper combines experimental and corpus-linguistic methodology to address this question. For a total of 39 V NP-idioms of the kind make a point or take the plunge, comprising more than 13,000 tokens obtained from the British National Corpus, compositionality as well as syntactic, lexico-syntactic, and morphological flexibility were assessed corpus-linguistically. The resulting corpus-based measures were then entered as predictors of native speakers' overall idiomaticity judgments in a multiple regression analysis to determine each factor's impact on those judgments. The results indicate that speakers indeed rely on multiple factors simultaneously, with lexico-syntactic and morphological factors being even more important than compositionality, and verb-related information being more important than NP-related information. Overall, the results support the theoretical concept of a collocation-idiom continuum and demonstrate how various, and sometimes competing, motivations determine a phrase's position on this continuum.
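
To make the multiple regression step concrete, the following is a minimal sketch in pure Python, not the analysis actually reported in the paper. It fits an ordinary least squares model in which per-phrase corpus measures predict a judgment score. The function name `fit_ols`, the three predictor columns, the six rows of numbers, and the coefficients in `TRUE_COEFS` are all hypothetical and chosen only so that the result can be checked; none of them come from the study.

```python
def fit_ols(X, y):
    """Fit y = b0 + b1*x1 + ... by solving the normal equations (X'X) b = X'y."""
    rows = [[1.0] + list(r) for r in X]  # prepend an intercept column
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda i: abs(A[i][c]))
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for i in range(k):
            if i != c:
                f = A[i][c]
                A[i] = [vi - f * vc for vi, vc in zip(A[i], A[c])]
    return [A[i][k] for i in range(k)]  # [intercept, b1, b2, ...]


# Hypothetical toy data: one row per phrase, with made-up columns standing in for
# (compositionality, lexico-syntactic flexibility, morphological flexibility).
X = [
    [0.1, 0.8, 0.4],
    [0.8, 0.3, 0.7],
    [0.5, 0.5, 0.1],
    [0.3, 0.6, 0.9],
    [0.9, 0.2, 0.3],
    [0.6, 0.9, 0.6],
]

# Arbitrary coefficients used to generate noise-free "judgments" for the sketch.
TRUE_COEFS = [1.0, 2.0, -1.0, 0.5]
y = [TRUE_COEFS[0] + sum(b * x for b, x in zip(TRUE_COEFS[1:], row)) for row in X]

coefs = fit_ols(X, y)
print([round(c, 6) for c in coefs])
```

Because the toy judgments are generated without noise, the fit recovers the arbitrary generating coefficients. With real data, comparing the relative weight of predictors would additionally require putting them on a common scale, for example by standardizing each predictor before fitting.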
