Dedicated Language Resources for Interdisciplinary Research on Multiword Expressions: Best Thing since Sliced Bread

Multiword expressions such as idioms (beat about the bush), collocations (plastic surgery) and lexical bundles (in the middle of) are challenging for disciplines like Natural Language Processing (NLP), psycholinguistics and second language acquisition, due to their more or less fixed character. Idiomatic expressions are especially problematic, because they convey a figurative meaning that cannot always be inferred from the literal meanings of the component words. Researchers acknowledge that important idiom properties, such as frequency of exposure, familiarity, transparency, and imageability, should be taken into account in research, but these properties typically rely on subjective judgments. This is probably one of the reasons why many studies of idiomatic expressions collected only limited information about idiom properties, and only for very small numbers of idioms. In this paper we report on cross-boundary work aimed at developing a set of tools and language resources that are considered crucial for this kind of multifaceted research. We discuss the results of our research and suggest possible avenues for future research.
