Matrix: a statistical method and software tool for linguistic analysis through corpus comparison
[1] Hans Martin Lehmann,et al. Collocational Evidence from the British National Corpus , 2000, Corpora Galore.
[2] Geoffrey Leech,et al. Grammatical word class variation within the British National Corpus sampler , 2002 .
[3] Tony McEnery,et al. A Corpus/annotation toolbox , 1997 .
[4] J. R. Firth,et al. The Technique of Semantics , 1935 .
[5] Ludovic Lebart,et al. Exploring Textual Data , 1997 .
[6] Nelleke Oostdijk,et al. Corpus Linguistics and the Automatic Analysis of English , 1991 .
[7] John Sinclair,et al. Looking up : an account of the COBUILD Project in lexical computing and the development of the Collins COBUILD English Language Dictionary , 1987 .
[8] Hamish Cunningham. GATE, a General Architecture for Text Engineering , 2002 .
[9] Paul Rayson,et al. How to generalise the task of annotation , 1997 .
[10] S Hockey. Concordance Programs for Corpus Linguistics , 2001 .
[11] Elena Tognini-Bonelli,et al. Corpus Linguistics at Work , 2002, Computational Linguistics.
[12] Christopher S. Butler,et al. Statistics in linguistics , 1985 .
[13] Bas Aarts,et al. Exploring Natural Language: Working with the British Component of the International Corpus of English , 2002 .
[14] Eric Atwell,et al. Dealing with ill-formed English text , 1987 .
[15] S. Jones,et al. English lexical collocations - A study in computational linguistics , 1974 .
[16] J. R. Firth,et al. A Synopsis of Linguistic Theory, 1930-1955 , 1957 .
[17] George Kingsley Zipf,et al. Human Behaviour and the Principle of Least Effort: an Introduction to Human Ecology , 2012 .
[18] D. Biber,et al. Drift and the Evolution of English Style: A History of Three Genres , 1989 .
[19] Timothy R. C. Read,et al. Multinomial goodness-of-fit tests , 1984 .
[20] Antoinette Renouf. Explorations in Corpus Linguistics , 1998 .
[21] G. Yule,et al. The statistical study of literary vocabulary , 1944 .
[22] H. Kucera,et al. Computational analysis of present-day American English , 1967 .
[23] Irving Lorge,et al. The semantic count of the 570 commonest English words , 1949 .
[24] Susan Conrad,et al. Corpus Linguistics: Investigating Language Structure and Use , 1998 .
[25] Larry Wall,et al. Programming Perl - covers Perl 5, 2nd Edition , 1996, A nutshell handbook.
[26] C Snow,et al. Child language data exchange system , 1984, Journal of Child Language.
[27] Kyo Kageura,et al. Bigram Statistics Revisited: A Comparative Examination of Some Statistical Measures in Morphological Analysis of Japanese Kanji Sequences , 1999, J. Quant. Linguistics.
[28] G. Leech,et al. Social differentiation in the use of English vocabulary: some analyses of the conversational component of the British National Corpus , 1997 .
[29] Paul Rayson,et al. Automatic Content Analysis of Spoken Discourse , 1992 .
[30] Hans Peter Luhn,et al. The Automatic Creation of Literature Abstracts , 1958, IBM J. Res. Dev..
[31] Geoffrey Leech,et al. CLAWS4: The Tagging of the British National Corpus , 1994, COLING.
[32] Gunnel Tottie,et al. English in speech and writing : a symposium , 1986 .
[33] Erik Smitterberg,et al. International Corpus of Learner English , 2004 .
[34] A. Woods,et al. Statistics in Language Studies , 1986 .
[35] Gregory P. Knowles,et al. Manual of information to accompany the SEC corpus , 1988 .
[36] Glyn Jones,et al. Concordances in the Classroom , 1990 .
[37] Marc Weeber,et al. Extracting the lowest-frequency words: pitfalls and possibilities , 2000, CL.
[38] Frank Yates. Contingency tables involving small numbers and the chi-squared test , 1934 .
[39] Geoffrey Rockwell,et al. Tactweb: The Intersection of Text-Analysis and Hypertext , 1997 .
[40] Sidney Greenbaum,et al. Comparing English worldwide : the International Corpus of English , 1996 .
[41] Mohsen Ghadessy,et al. Small corpus studies and ELT : theory and practice , 2001 .
[42] Michael Oakes,et al. Statistics for Corpus Linguistics , 1998 .
[43] Michael McCarthy,et al. Vocabulary: Description, Acquisition and Pedagogy , 1990 .
[44] Carl Gutwin,et al. Domain-Specific Keyphrase Extraction , 1999, IJCAI.
[45] Wolfgang Lezius,et al. An XML-based Representation Format for Syntactically Annotated Corpora , 2000, LREC.
[46] Nancy Ide,et al. Corpus encoding standard: SGML guidelines for encoding linguistic corpora , 1998, LREC.
[47] Timothy R. C. Read,et al. Goodness-Of-Fit Statistics for Discrete Multivariate Data , 1988 .
[48] Paul Rayson,et al. The ACAMRIT semantic tagging system: progress report , 1996 .
[49] C. Chapelle. The Computational Analysis of English—A Corpus‐Based Approach , 1988 .
[50] John Bibby,et al. The Analysis of Contingency Tables , 1978 .
[51] Oliver Christ,et al. A Modular and Flexible Architecture for an Integrated Corpus Query System , 1994, ArXiv.
[52] John Sinclair,et al. Corpus, Concordance, Collocation , 1991 .
[53] S. Hockey. Electronic Texts in the Humanities , 2000 .
[54] Barbara Lewandowska-Tomaszczyk,et al. PALC'99--Practical Applications in Language Corpora : papers from the international conference at the University of Łódź, 15-18 April 1999 , 2000 .
[55] Robert J. Gaizauskas,et al. Coupling information retrieval and information extraction: A new text technology for gathering information from the web , 1997, RIAO.
[56] R. Harald Baayen,et al. Statistical models for word frequency distributions: A linguistic evaluation , 1992, Comput. Humanit..
[57] Michael A. West,et al. A general service list of English words, with semantic frequencies and a supplementary word-list for the writing of popular science and technology , 1953 .
[58] R. Harald Baayen,et al. Word Frequency Distributions , 2001 .
[59] Douglas Biber,et al. Dimensions of Register Variation: A Cross-Linguistic Comparison , 1995 .
[60] Clive Souter,et al. Corpus-Based Computational Linguistics , 1993 .
[61] Thomas G. Dietterich. Approximate Statistical Tests for Comparing Supervised Classification Learning Algorithms , 1998, Neural Computation.
[62] Martin Weisser. Programming for Corpus Linguistics: How to Do Text Analysis with Java , 2001 .
[63] Mike Scott,et al. 3. Comparing corpora and identifying key words, collocations, and frequency distributions through the WordSmith Tools suite of computer programs , 2001 .
[64] Geoffrey Leech,et al. Running a grammar factory: The production of syntactically analysed corpora or “treebanks” , 1991 .
[65] W. G. Cochran. Some Methods for Strengthening the Common χ² Tests , 1954 .
[66] William B. Stiles,et al. Describing talk : a taxonomy of verbal response modes , 1992 .
[67] Terry Winograd,et al. Understanding natural language , 1974 .
[68] S. Dawson. Keywords: a Vocabulary of Culture and Society , 1976 .
[69] Geoffrey Leech,et al. Using corpora for language research : studies in the honour of Geoffrey Leech , 1996 .
[70] Karen Sparck Jones. Automatic keyword classification for information retrieval , 1971 .
[71] A. Agresti,et al. Categorical Data Analysis , 1991, International Encyclopedia of Statistical Science.
[72] Paul Rayson,et al. Template analysis: bridging the gap between grammar and the lexicon , 1996 .
[73] Sylviane Granger,et al. Automatic Profiling of Learner Texts , 1998 .
[74] Susan Hunston,et al. Corpora in Applied Linguistics , 2002 .
[75] Douglas Biber,et al. Variation across speech and writing: Methodology , 1988 .
[76] Mike Scot. Review of MonoConc Pro and WordSmith Tools , 2001 .
[77] Russell V. Lenth,et al. Computer Intensive Methods for Testing Hypotheses: An Introduction , 1990 .
[78] Ian H. Witten,et al. Lexically-generated subject hierarchies for browsing large collections , 1999, International Journal on Digital Libraries.
[79] Atro Voutilainen. A Short History of Tagging , 1999 .
[80] Stig Johansson,et al. Some aspects of the vocabulary of learned and scientific English , 1978 .
[81] Brett Kessler,et al. Book Reviews: The Significance of Word Lists , 2001, CL.
[82] Raymond Williams. Keywords: A Vocabulary of Culture and Society , 1976 .
[83] Paul Rayson,et al. Comparing Corpora using Frequency Profiling , 2000, Proceedings of the workshop on Comparing corpora -.
[84] Adam Kilgarriff,et al. Using Word Frequency Lists to Measure Corpus Homogeneity and Similarity between Corpora , 1997, VLC.
[85] Michael McGill,et al. Introduction to Modern Information Retrieval , 1983 .
[86] Shishir Gundavaram,et al. CGI Programming on the World Wide Web , 1996 .
[87] Geoffrey Leech,et al. The Use of Tagging , 1999 .
[88] Daniel Jurafsky,et al. Verb Subcategorization Frequency Differences between Business-News and Balanced Corpora: The Role of Verb Sense , 2000, ACL 2000.
[89] John Strang. Programming with Curses , 1986 .
[90] Geoffrey Sampson,et al. English for the Computer: The SUSANNE Corpus and Analytic Scheme , 1995, Computational Linguistics.
[91] Ian Sommerville,et al. MOG User Interface Builder: A Mechanism for Integrating Application and User Interface , 1993, Interact. Comput..
[92] George R. Doddington. CSR Corpus Development , 1992, HLT.
[93] Tony McEnery,et al. Swearing and abuse in modern British English , 2000 .
[94] Alphonse G. Juilland,et al. Frequency dictionary of Rumanian words , 1964 .
[95] Geoffrey Leech,et al. Corpus Annotation: Linguistic Information from Computer Text Corpora , 1997 .
[96] Timothy R. C. Read,et al. Pearson's X² and the loglikelihood ratio statistic G²: a comparative review , 1989 .
[97] Magnus Ljung,et al. A frequency dictionary of English morphemes , 1974 .
[98] Douglas Biber,et al. Representativeness in corpus design , 1993 .
[99] Sylviane Granger,et al. The computer learner corpus: a versatile new source of data for SLA research , 1998 .
[100] Charles Carpenter Fries,et al. English word lists : a study of their adaptability for instruction , 1965 .
[101] Jan Svartvik,et al. Directions in corpus linguistics : proceedings of Nobel Symposium 82, Stockholm, 4-8 August 1991 , 1992 .
[102] Adam Kilgarriff,et al. Which words are particularly characteristic of a text? a survey of statistical approaches , 1996 .
[103] M. Stubbs. British Traditions in Text Analysis — From Firth to Sinclair , 1993 .
[104] Adam Kilgarriff,et al. Measures for Corpus Similarity and Homogeneity , 1998, EMNLP.
[105] C. Mehta,et al. A network algorithm for the exact treatment of Fisher's exact test in RxC contingency tables , 1983 .
[106] Ian Marshall,et al. Choice of grammatical word-class without global syntactic analysis: Tagging words in the lob corpus , 1983, Comput. Humanit..
[107] Christopher S. Butler,et al. Computers and written texts , 1992 .
[108] Colin Good. Attitudes Towards Europe: Language in the Unification Process , 2001 .
[109] John Sinclair. Corpus typology : a framework for classification , 1995 .
[110] G. Leech,et al. Word Frequencies in Written and Spoken English: based on the British National Corpus , 2001 .
[111] Sylviane Granger,et al. Learner English on Computer , 1998 .
[112] Sylviane Granger,et al. The International Corpus of Learner English , 1993 .
[113] Andrew Wilson. Towards an Integration of Content Analysis and Discourse Analysis: The Automatic Linkage of Key Relations in Text , 1993 .
[114] Stig Johansson,et al. English computer corpora : selected papers and research guide , 1991 .
[115] M. Stubbs. Text and Corpus Analysis: Computer-Assisted Studies of Language and Culture , 1996 .
[116] Mike Scott,et al. Mapping key words to problem and solution , 2001 .
[117] Anthony McEnery,et al. Parallel alignment in English and Chinese , 2000 .
[118] Geoffrey Leech,et al. Standards for Tagsets. , 1999 .
[119] L. Burnard,et al. Genres, keywords, teaching: towards a pedagogic account of the language of project proposals , 2000 .
[120] R Kawecki,et al. The use of an on-line trilingual corpus for the teaching of reading comprehension in French , 2001 .
[121] Ted Pedersen,et al. Significant Lexical Relationships , 1996, AAAI/IAAI, Vol. 1.
[122] Anthony McEnery,et al. Rethinking Language Pedagogy from a Corpus Perspective: Papers from the Third International Conference on Teaching and Language Corpora , 2000 .
[123] Jeremy Fox. Computers in English language teaching and research: Leech, Geoffrey and Candlin, Christopher N. (eds.), London: Longman, 1986, 230 pp., £5.90. (Applied Linguistics and Language Study) , 1986 .
[124] Toru Hisamitsu,et al. Extracting useful terms from parenthetical expressions by combining simple rules and statistical measures: A comparative evaluation of bigram statistics , 2001 .
[125] F. Yates,et al. Tests of Significance for 2 × 2 Contingency Tables , 1984 .
[126] Tony McEnery. Database Design For Corpus Storage : The ET 1063 Data Model , 1993 .
[127] Nicola Guarino,et al. Formal ontology, conceptual analysis and knowledge representation , 1995, Int. J. Hum. Comput. Stud..
[128] Mick Short,et al. Using Corpora for Language Research , 1998 .
[129] Tony McEnery,et al. Multilingual resources for European languages: contributions of the CRATER project , 1997 .
[130] John B. Carroll,et al. The American Heritage Word Frequency Book , 1971 .
[131] Lorna Hughes,et al. CTI Centre for Textual Studies Resources Guide , 1994 .
[132] Hamish Cunningham,et al. A definition and short history of Language Engineering , 1999, Natural Language Engineering.
[133] Geoffrey Barnbrook. Language and Computers: A Practical Introduction to the Computer Analysis of Language , 1996 .
[134] Alphonse G. Juilland,et al. Frequency dictionary of French words , 1971 .
[135] Anne Wichmann,et al. Teaching and Language Corpora , 1997 .
[136] K. Sin,et al. Language engineering for legal transplantation: Conceptual problems in creating common law Chinese , 1996 .
[137] Mona Baker,et al. Text and technology : in honour of John Sinclair , 1993 .
[138] Roger Garside,et al. A Probabilistic Parser , 1985, EACL.
[139] Richard Jones. Creating and using a corpus of spoken German , 1997 .
[140] Paul Rayson,et al. Higher-level annotation tools , 1997 .
[141] Ted Dunning,et al. Accurate Methods for the Statistics of Surprise and Coincidence , 1993, CL.
[142] Kalina Bontcheva,et al. Software Infrastructure for Language Resources: a Taxonomy of Previous Work and a Requirements Analysis , 2000, LREC.
[143] Anthony A. Lyne. The vocabulary of French business correspondence , 1985 .
[144] Carl W. Roberts,et al. Text analysis for the social sciences : methods for drawing statistical inferences from texts and transcripts , 1997 .
[145] John Bradley,et al. Using Tact With Electronic Texts: A Guide to Text-Analysis Computing Tools : Version 2.1 for MS-DOS and PC DOS , 1996 .
[146] P.J.M. de Haan,et al. Corpus-based research into language. In honour of Jan Aarts , 1994 .
[147] Nancy Ide,Greg Priest-Dorman. Corpus Encoding Standard (CES) , 2000 .
[148] Stig Johansson. Word frequency and text type: Some observations based on the LOB corpus of British English texts , 1985, Comput. Humanit..
[149] Signe Oksefjell Ebeling,et al. Out of Corpora , 1999 .
[150] Ken Williams,et al. The Failure of Pearson's Goodness of Fit Statistic , 1976 .
[151] Ted Pedersen,et al. Fishing for Exactness , 1996, ArXiv.
[152] David Yarowsky,et al. One Sense Per Discourse , 1992, HLT.
[153] Geoffrey Leech,et al. Spoken English on Computer: Transcription, Mark-Up and Application , 1995 .
[154] W. Nelson Francis,et al. Frequency Analysis of English Usage: Lexicon and Grammar , 1983 .
[155] G. Leech. 100 million words of English , 1993, English Today.
[156] Pam Peters,et al. New frontiers of corpus research: papers from the Twenty First International Conference on English Language Research on Computerized Corpora Sydney 2000 , 2002 .
[157] Vincent Ooi,et al. Collocations in Singaporean-Malaysian English , 2000 .
[158] Elena Semino,et al. Using a corpus for stylistics research : speech presentation. , 1996 .
[159] Hans van Halteren,et al. Improving Data Driven Wordclass Tagging by System Combination , 1998, ACL.
[160] Beatrice Santorini,et al. Building a Large Annotated Corpus of English: The Penn Treebank , 1993, CL.
[161] John M. Kirk. Corpora galore : analyses and techniques in describing English : papers from the nineteenth International Conference on English Language Research on Computerised Corpora (ICAME 1998) , 2000 .
[162] Ralph Grishman,et al. Computational linguistics : an introduction , 1986 .
[163] Heles Contreras,et al. Frequency Dictionary of Spanish Words , 1964 .
[164] Atro Voutilainen,et al. A language-independent system for parsing unrestricted text , 1995 .
[165] S. Johansson,et al. Word Frequencies in British and American English , 1985 .
[166] Bas Aarts,et al. The verb in contemporary English , 1995 .
[167] Stig Johansson,et al. Manual of information to accompany the Lancaster-Oslo/Bergen Corpus of British English, for use with digital computers , 1978 .
[168] Benny Brodda. Doing corpus work with PC Beta; or, how to be your own computational linguist , 1991 .
[169] Roel Popping. Computer Programs for the Analysis of Texts and Transcripts , 1997 .
[170] Scott Deerwester,et al. English in computer science : a corpus-based lexical analysis , 1994 .
[171] C. D. Paice. Information retrieval and the computer , 1977 .
[172] Anthony McEnery,et al. Multilingual Corpora In Teaching And Research. , 2000 .
[173] Mike Scott,et al. PC analysis of key words — And key key words , 1997 .
[174] Catherine N. Ball. Automated Text Analysis: Cautionary Tales , 1993 .
[175] B. MacWhinney. The CHILDES project: tools for analyzing talk , 1992 .
[176] Jeremy M. R. Martin,et al. The Oxford Concordance Program Version 2 , 1987 .
[177] H. Dahl. Word frequencies of spoken American English , 1979 .
[178] Gunnel Melchers,et al. Studies in Anglistics , 1995 .
[179] Peter Sawyer,et al. Assisting requirements engineering with semantic document analysis , 2000, RIAO.
[180] R. Schiffer. Psychobiology of Language , 1986 .
[181] G. Francis. A Corpus-Driven Approach to Grammar — Principles, Methods and Examples , 1993 .
[182] Alexander S. Yeh,et al. More accurate tests for the statistical significance of result differences , 2000, COLING.
[183] H. P. Edmundson,et al. New Methods in Automatic Extracting , 1969, JACM.
[184] Sylvie De Cock,et al. A Recurrent Word Combination Approach to the Study of Formulae in the Speech of Native and Non-Native Speakers of English , 1998 .
[185] Geoffrey Leech,et al. Introducing corpus annotation , 1997 .
[186] Joe Zhou,et al. Phrasal Terms in Real-World IR Applications , 1999 .