MWEs and Topic Modelling: Enhancing Machine Learning with Linguistics
[1] Eva Forsbom,et al. Training a super model look-alike , 2003, MTSUMMIT.
[2] Timothy Baldwin,et al. Multiword Expressions : Some Problems for Japanese NLP , 2002 .
[3] Eric Wehrli,et al. Le problème des collocations en TAL , 2006 .
[4] Sébastien Paumier. De la reconnaissance de formes linguistiques à l'analyse syntaxique. (From Pattern Matching in Text to Syntactic Parsing) , 2003 .
[5] R. Sinha,et al. Machine Translation of Bi-lingual Hindi-English (Hinglish) Text , 2005, MTSUMMIT.
[6] Dan Klein,et al. Feature-Rich Part-of-Speech Tagging with a Cyclic Dependency Network , 2003, NAACL.
[7] R. Mahesh K. Sinha. Mining Complex Predicates In Hindi Using A Parallel Hindi-English Corpus , 2009, MWE@IJCNLP.
[8] Pavel Rychlý,et al. Manatee, Bonito and Word Sketches for Czech , 2004 .
[9] Darren Pearce. A Comparative Evaluation of Collocation Extraction Techniques , 2002, LREC.
[10] Timothy Baldwin,et al. Multiword expressions: linguistic precision and reusability , 2002, LREC.
[11] David Yarowsky,et al. One Sense per Collocation , 1993, HLT.
[12] Eneko Agirre,et al. Word Sense Disambiguation: Algorithms and Applications (Text, Speech and Language Technology) , 2006 .
[13] Ted Pedersen,et al. Word Sense Discrimination by Clustering Contexts in Vector and Similarity Spaces , 2004, CoNLL.
[14] Yves Scherrer,et al. Deep Linguistic Multilingual Translation and Bilingual Dictionaries , 2009, WMT@EACL.
[15] Carlos Ramisch,et al. Towards the Construction of Language Resources for Greek Multiword Expressions: Extraction and Evaluation , 2010, LREC 2010.
[16] Jim Breen,et al. JMdict: a Japanese-Multilingual Dictionary , 2004 .
[17] Eric Laporte,et al. A French Corpus Annotated for Multiword Expressions with Adverbial Function , 2008, LAW II 2008.
[18] Jonas Kuhn,et al. Exploiting Translational Correspondences for Pattern-Independent MWE Identification , 2009, MWE@IJCNLP.
[19] Satoshi Shirai,et al. Toward an MT System without Pre-Editing - Effects of New Methods in ALT-J/E - , 1995, ArXiv.
[20] Yuji Matsumoto,et al. Combining resources for open source machine translation , 2007, TMI.
[21] Setsuo Yamada,et al. Corpus-Assisted Expansion of Manual MT Knowledge , 2002 .
[22] Stefan Evert,et al. The Statistics of Word Cooccurrences: Word Pairs and Collocations , 2004 .
[23] Eric Wehrli,et al. Fips, A “Deep” Linguistic Multilingual Parser , 2007, ACL 2007.
[24] J. Murray. Oxford Collocations Dictionary for Students of English , 2003 .
[25] Timothy Baldwin,et al. Multiword Expressions: A Pain in the Neck for NLP , 2002, CICLing.
[26] T. Mohanan. Argument structure in Hindi , 1994 .
[27] Björn-Olav Dozo,et al. Quantitative Analysis of Culture Using Millions of Digitized Books , 2010 .
[28] Carlos Ramisch,et al. Proceedings of the Workshop on Multiword Expressions: from Parsing and Generation to the Real World, MWE@ACL 2011, Portland, Oregon, USA, June 23, 2011 , 2011, MWE@ACL.
[29] Hermann Ney,et al. A Systematic Comparison of Various Statistical Alignment Models , 2003, CL.
[30] Adam Kilgarriff,et al. Large Linguistically-Processed Web Corpora for Multiple Languages , 2006, EACL.
[31] S. C. Kohs,et al. The vocabulary test as a measure of intelligence. , 1918 .
[32] Dawn Archer,et al. Extracting Multiword Expressions with A Semantic Tagger , 2003, ACL 2003.
[33] Suzanne Stevenson,et al. Statistical Measures of the Semi-Productivity of Light Verb Constructions , 2004 .
[34] Yuji Matsumoto,et al. Applying Conditional Random Fields to Japanese Morphological Analysis , 2004, EMNLP.
[35] Ray Jackendoff,et al. The Architecture of the Language Faculty , 1996 .
[36] P. McCullagh. Estimating the Number of Unseen Species: How Many Words did Shakespeare Know? , 2008 .
[37] L. Danlos,et al. Translation in the predicative element of a sentence: category switching, aspect and diathesis , 1992, TMI.
[38] Hugh E. Williams,et al. The Zettair Search Engine , 1998 .
[39] Timothy Baldwin,et al. An Empirical Model of Multiword Expression Decomposability , 2003, ACL 2003.
[40] Peter Edwin Hook,et al. The compound verb in Hindi , 1976 .
[41] Samuel Reese,et al. FreeLing 2.1: Five Years of Open-source Language Processing Tools , 2010, LREC.
[42] David Yarowsky,et al. Using Large Monolingual and Bilingual Corpora to Improve Coordination Disambiguation , 2011, ACL.
[43] Pushpak Bhattacharyya,et al. Hindi Compound Verbs and their Automatic Extraction , 2008, COLING.
[44] Mark Dras,et al. Automatic Identification of Support Verbs: A Step Towards a Definition of Semantic Weight , 1995, ArXiv.
[45] Stefan Evert,et al. Using small random samples for the manual evaluation of statistical association measures , 2005, Comput. Speech Lang..
[46] Eric Wehrli,et al. Creating a multilingual collocations dictionary from large text corpora , 2003, EACL.
[47] David Wible,et al. A Method for Unsupervised Broad-Coverage Lexical Error Detection and Correction , 2009, BEA@NAACL.
[48] Victoria Arranz,et al. Multiwords and Word Sense Disambiguation , 2005, CICLing.
[49] Aravind K. Joshi,et al. Relative Compositionality of Multi-word Expressions: A Study of Verb-Noun (V-N) Collocations , 2005, IJCNLP.
[50] O. Jespersen. A modern English grammar on historical principles , 1928 .
[51] Dan I. Moldovan,et al. Word sense disambiguation of WordNet glosses , 2004, Comput. Speech Lang..
[52] Eric Nichols,et al. Deep open-source machine translation , 2011, Machine Translation.
[53] Kenneth Ward Church,et al. Text Analysis and Word Pronunciation in Text-to-speech Synthesis , 2013 .
[54] Aline Villavicencio,et al. Statistically-Driven Alignment-Based Multiword Expression Identification for Technical Domains , 2009, MWE@IJCNLP.
[55] Yves Lepage,et al. Sampling-based Multilingual Alignment , 2009, RANLP.
[56] Christiane Fellbaum,et al. Book Reviews: WordNet: An Electronic Lexical Database , 1999, CL.
[57] Tim van de Cruys,et al. Semantics-based Multiword Expression Extraction , 2007 .
[58] Violeta Seretan,et al. An integrated environment for extracting and translating collocations , 2009 .
[59] Violeta Seretan,et al. Syntax-Based Extraction , 2011 .
[60] Oliver Christ,et al. A Modular and Flexible Architecture for an Integrated Corpus Query System , 1994, ArXiv.
[61] Karen Kukich,et al. Techniques for automatically correcting words in text , 1992, CSUR.
[62] Iris Hendrickx,et al. Complex Predicates Annotation in a Corpus of Portuguese , 2010, Linguistic Annotation Workshop.
[63] Mark Steedman,et al. The syntactic process , 2004, Language, speech, and communication.
[64] Ian H. Witten,et al. Data mining: practical machine learning tools and techniques with Java implementations , 2002, SGMD.
[65] Tanja Samardžić,et al. Cross-Lingual Variation of Light Verb Constructions: Using Parallel Corpora and Automatic Alignment for Linguistic Research , 2010 .
[66] Morris Salkoff,et al. Automatic translation of support verb constructions , 1990, COLING.
[67] Adam Kilgarriff,et al. The Sketch Engine , 2004 .
[68] Kenneth Ward Church,et al. Morphology and rhyming: two powerful alternatives to letter-to-sound rules for speech synthesis , 1990, SSW.
[69] Kim Nam Su,et al. Statistical modeling of multiword expressions , 2008 .
[70] Aline Villavicencio,et al. UFRGS@CLEF2008: Indexing Multiword Expressions for Information Retrieval , 2008, CLEF.
[71] Chris Callison-Burch,et al. Scaling Phrase-Based Statistical Machine Translation to Larger Corpora and Longer Phrases , 2005, ACL.
[72] Frederick Jelinek,et al. Some of my Best Friends are Linguists , 2005, Lang. Resour. Evaluation.
[73] Richard Sproat. English noun-phrase accent prediction for text-to-speech , 1994, Comput. Speech Lang..
[74] M. Barlow. ParaConc : Concordance Software for Multilingual Parallel Corpora , 2002 .
[75] Mark Johnson,et al. Unsupervised learning of multi-word verbs , 2001 .
[76] Paul Procter,et al. Cambridge international dictionary of English , 2000 .
[77] Ian H. Witten,et al. The WEKA data mining software: an update , 2009, SKDD.
[78] Andrei Broder,et al. A taxonomy of web search , 2002, SIGF.
[79] Michele Banko,et al. Scaling to Very Very Large Corpora for Natural Language Disambiguation , 2001, ACL.
[80] David Wible,et al. StringNet as a Computational Resource for Discovering and Investigating Linguistic Constructions , 2010, HLT-NAACL 2010.
[81] Helmut Schmid,et al. Probabilistic part-of-speech tagging using decision trees , 1994 .
[82] Timothy Baldwin,et al. Interpretation of Compound Nominalisations using Corpus and Web Statistics , 2006 .
[83] James Rogers. Capturing CFLs with Tree Adjoining Grammars , 1994, ACL.
[84] F. Mosteller,et al. Inference and Disputed Authorship: The Federalist , 1966 .
[85] George A. Miller,et al. WordNet: A Lexical Database for English , 1995, HLT.
[86] Aravind K. Joshi,et al. Using Information about Multi-word Expressions for the Word-Alignment Task , 2006 .
[87] Masaki Murata,et al. Multilingual Aligned Parallel Treebank Corpus Reflecting Contextual Information and Its Applications , 2004 .
[88] J. Silva,et al. A Local Maxima method and a Fair Dispersion Normalization for extracting multi-word units from corpora , 2009 .
[89] J. R. Firth,et al. A Synopsis of Linguistic Theory, 1930-1955 , 1957 .
[90] Jan Tore Lønning,et al. Towards hybrid quality-oriented machine translation – on linguistics and probabilities in MT , 2007, TMI.
[91] Anabela Barreiro,et al. ReEscreve: a translator-friendly multi-purpose paraphrasing software tool , 2009 .
[92] José Gabriel Pereira Lopes,et al. Language Independent Automatic Acquisition of Rigid Multiword Units from Unrestricted Text Corpora , 1999 .
[93] Stefan Evert,et al. Experiments on Candidate Data for Collocation Extraction , 2003, EACL.
[94] Timothy Baldwin,et al. Multiword Expressions , 2010, Handbook of Natural Language Processing.
[95] Kenneth Ward Church,et al. Enhanced Good-Turing and Cat-Cal: Two New Methods for Estimating Probabilities of English Bigrams (abbreviated version) , 1989, HLT.
[96] Uri Zernik,et al. Lexical acquisition: Exploiting on-line resources to build a lexicon. , 1991 .
[97] Leo Breiman,et al. Bagging Predictors , 1996, Machine Learning.
[98] Carlos Ramisch,et al. mwetoolkit: a Framework for Multiword Expression Identification , 2010, LREC.
[99] Eric Brill,et al. Spelling Correction as an Iterative Process that Exploits the Collective Knowledge of Web Users , 2004, EMNLP.
[100] Adam Kilgarriff,et al. Language is never, ever, ever, random , 2005 .
[101] Beatrice Daille,et al. Combined approach for terminology extraction: lexical statistics and linguistic filtering , 1995 .
[102] Mike Scott. Wordsmith Tools version 3 , 1997 .
[103] Kenneth Ward Church,et al. Termight: Identifying and Translating Technical Terminology , 1994, ANLP.
[104] Timothy Baldwin,et al. Extracting the Unextractable: A Case Study on Verb-particles , 2002, CoNLL.
[105] Carlos Ramisch,et al. Web-based and combined language models: a case study on noun compound identification , 2010, COLING.
[106] Stefan Langer,et al. A Formal Specification of Support Verb Constructions , 2009 .
[107] Joakim Nivre,et al. Multiword Units in Syntactic Parsing , 2004 .
[108] Dan Klein,et al. Accurate Unlexicalized Parsing , 2003, ACL.
[109] Ted Dunning,et al. Accurate Methods for the Statistics of Surprise and Coincidence , 1993, CL.
[110] Aline Villavicencio,et al. Introduction to the special issue on multiword expressions: Having a crack at a hard nut , 2005, Comput. Speech Lang..
[111] L. F. L. Cintra,et al. Crónica geral de Espanha de 1344 , 1952 .
[112] Y. Tanaka,et al. Compilation of a multilingual parallel corpus , 2001 .
[113] Angelika Storrer,et al. Multiword Lexemes: A Monolingual and Contrastive Typology for NLP and MT , 1992, IWBS Report.
[114] Frank Smadja,et al. Retrieving Collocations from Text: Xtract , 1993, CL.
[115] Donald R. Morrison,et al. PATRICIA—Practical Algorithm To Retrieve Information Coded in Alphanumeric , 1968, J. ACM.
[116] Pavel Pecina. Lexical Association Measures: Collocation Extraction , 2008 .
[117] Archna Bhatia,et al. PropBank Annotation of Multilingual Light Verb Constructions , 2010, Linguistic Annotation Workshop.
[118] Stefan Langer,et al. A linguistic test battery for support verb constructions , 2004 .
[119] Karen Sparck Jones. What is the Role of NLP in Text Retrieval , 1999 .
[120] Andreas Stolcke,et al. Entropy-based Pruning of Backoff Language Models , 2000, ArXiv.
[121] Aline Villavicencio,et al. Lexical Encoding of MWEs , 2004 .
[122] Paul Rayson. Wmatrix : a web-based corpus processing environment , 2022 .
[123] Christopher R. Johnson,et al. Lexicographic Relevance: Selecting Information From Corpus Evidence , 2003 .
[124] German Rigau,et al. The TALP systems for disambiguating WordNet glosses , 2004, SENSEVAL@ACL.
[125] Pavel Pecina,et al. Lexical association measures and collocation extraction , 2009, Lang. Resour. Evaluation.
[126] Hilda Monetto Flores da Silva. VERBOS-SUPORTE OU EXPRESSÕES LEXICALIZADAS? , 2009 .
[127] Sue Atkins. The DANTE Database: Its Contribution to English Lexical Research, and in Particular to Complementing the FrameNet Data , 2010, A Way with Words.
[128] Graça Rio-Torto,et al. O Léxico : semântica e gramática das unidades lexicais , 2006 .
[129] Shuly Wintner,et al. Identifying Multi-word Expressions by Leveraging Morphological and Syntactic Idiosyncrasy , 2010, COLING.
[130] Kenneth Ward Church,et al. Word Association Norms, Mutual Information, and Lexicography , 1989, ACL.
[131] Christopher D. Manning,et al. Enriching the Knowledge Sources Used in a Maximum Entropy Part-of-Speech Tagger , 2000, EMNLP.
[132] Simone Teufel,et al. Corpus-based Method for Automatic Identification of Support Verbs for Nominalizations , 1995, EACL.
[133] Amitabha Mukerjee,et al. Detecting Complex Predicates in Hindi using POS Projection across Parallel Corpora , 2006 .
[134] Simon Charest,et al. Élaboration automatique d’un dictionnaire de cooccurrences grand public , 2007, JEPTALNRECITAL.
[135] Carlos Ramisch,et al. Alignment-based extraction of multiword expressions , 2010, Lang. Resour. Evaluation.
[136] Dan Flickinger,et al. Minimal Recursion Semantics: An Introduction , 2005 .
[137] David Yarowsky,et al. One Sense Per Discourse , 1992, HLT.
[138] Carlos Ramisch,et al. Multiword Expressions in the wild? The mwetoolkit comes in handy , 2010, COLING.
[139] Gerlof Bouma. Collocation Extraction beyond the Independence Assumption , 2010, ACL.
[140] Pushpak Bhattacharyya,et al. Verbs are where all the action lies: Experiences of Shallow Parsing of a Morphologically Rich Language , 2010, COLING.
[141] M. Tomasello. Regularity and Idiomaticity in Grammatical Constructions: The Case of Let Alone , 2003 .
[142] Aravind K. Joshi,et al. Tree-Adjoining Grammars , 1997, Handbook of Formal Languages.
[143] Ken Ward Church,et al. Using Word-Sense Disambiguation Methods to Classify Web Queries by Intent , 2009, EMNLP.
[144] Miriam Butt. The Structure of Complex Predicates in Urdu , 1995 .
[145] Ralph Grishman,et al. Towards Best Practice for Multiword Expressions in Computational Lexicons , 2002, LREC.
[146] Jörg Tiedemann,et al. Identifying idiomatic expressions using automatic word-alignment , 2006 .
[147] Afsaneh Fazly,et al. Pulling their Weight: Exploiting Syntactic Forms for the Automatic Identification of Idiomatic Expressions in Context , 2007 .
[148] Stefan Evert,et al. Methods for the Qualitative Evaluation of Lexical Association Measures , 2001, ACL.
[149] Kenneth Ward Church. Empirical Estimates of Adaptation: The chance of Two Noriegas is closer to p/2 than p^2 , 2000, COLING.
[150] Ataliba Teixeira de Castilho,et al. Gramática do português falado , 1990 .
[151] R. Mahesh K. Sinha. Learning Disambiguation of Hindi Morpheme "vaalaa" with a Sparse Corpus , 2009, 2009 International Conference on Machine Learning and Applications.
[152] K. Sinha,et al. Dealing with Replicative Words in Hindi for Machine Translation to English , 2005, MTSUMMIT.
[153] Yi Zhang,et al. Towards Domain-Independent Deep Linguistic Processing: Ensuring Portability and Re-Usability of Lexicalised Grammars , 2008, COLING 2008.
[154] Yuji Matsumoto,et al. Feedback Cleaning of Machine Translation Rules Using Automatic Evaluation , 2003, ACL.
[155] Miriam Butt. The Light Verb Jungle , 2003 .
[156] Andy Way. A hybrid architecture for robust MT using LFG-DOP , 1999, J. Exp. Theor. Artif. Intell..
[157] Pavel Pecina,et al. Combining Association Measures for Collocation Extraction , 2006, ACL.
[158] Colin Bannard. A Measure of Syntactic Flexibility for Automatically Identifying Multiword Expressions in Corpora , 2007 .
[159] Dustin Boswell. UCSD Research Exam (Summer 2004) "Speling Korecksion: A Survey of Techniques from Past to Present" (Final Draft). , 2005 .
[160] Satanjeev Banerjee,et al. The Design, Implementation, and Use of the Ngram Statistics Package , 2003, CICLing.
[161] Hang Cui,et al. Extending corpus-based identification of light verb constructions using a supervised learning framework , 2006 .
[162] Mark A. Finlayson,et al. Source code and data for MWE'2011 papers , 2011 .
[163] Ellen M. Voorhees,et al. Evaluating Evaluation Measure Stability , 2000, SIGIR 2000.
[164] Hinrich Schütze,et al. Introduction to information retrieval , 2008 .
[165] Joseph D. Becker. The Phrasal Lexicon , 1975, TINLAP.
[166] Y. Bar-Hillel. A Quasi-Arithmetical Notation for Syntactic Description , 1953 .
[167] Doug Beeferman,et al. Say what? why users choose to speak their web queries , 2010, INTERSPEECH.
[168] Timothy Baldwin,et al. Road-testing the English Resource Grammar Over the British National Corpus , 2004, LREC.
[169] Ted Pedersen,et al. Significant Lexical Relationships , 1996, AAAI/IAAI, Vol. 1.
[170] Satoshi Shirai,et al. Construction of a Dictionary for Translating Japanese Phrases into One English Word , 2001 .
[171] Philipp Koehn,et al. Europarl: A Parallel Corpus for Statistical Machine Translation , 2005, MTSUMMIT.
[172] Aline Villavicencio,et al. Automated Multiword Expression Prediction for Grammar Engineering , 2006 .
[173] Martin F. Porter,et al. An algorithm for suffix stripping , 1997, Program.