Every method counts: Combining corpus-based and experimental evidence in the study of synonymy

Abstract: In this study we explore the concurrent, combined use of three research methods: statistical corpus analysis and two psycholinguistic experiments (a forced-choice task and an acceptability rating task), using verbal synonymy in Finnish as a case in point. In addition to supporting conclusions from earlier studies concerning the relationships between corpus-based and experimental data (e.g., Featherston 2005), we show that each method adds to our understanding of the studied phenomenon in a way that could not be achieved through any single method by itself. Most importantly, whereas relative rareness in a corpus is associated with dispreference in selection, such infrequency does not categorically entail substantially lower acceptability. Furthermore, we show that forced-choice and acceptability rating tasks pertain to distinct linguistic processes, with category-wise incommensurable scales of measurement, and should therefore be merged with caution, if at all.
