Building Linguistic Corpora from Wikipedia Articles and Discussions
[1] Uli Kutter,et al. Literatur. , 1941, Subjekt.
[2] Christiane Fellbaum,et al. Book Reviews: WordNet: An Electronic Lexical Database , 1999, CL.
[3] Steven John Metsker. The Design Patterns Java Workbook , 2002 .
[4] Giuseppe Attardi,et al. Semantically Annotated Snapshot of the English Wikipedia , 2008, LREC.
[5] Nancy Ide,et al. XCES: An XML-based Encoding Standard for Linguistic Corpora , 2000, LREC.
[6] Nelleke Oostdijk,et al. Variability in Dutch Tweets. An estimate of the proportion of deviant word tokens , 2014, J. Lang. Technol. Comput. Linguistics.
[7] Ludovic Denoyer,et al. The Wikipedia XML corpus , 2006, SIGF.
[8] Helmut Schmid,et al. Probabilistic part-of-speech tagging using decision trees , 1994 .
[9] Thomas Bartz,et al. Optimierung des Stuttgart-Tübingen-Tagset für die linguistische Annotation von Korpora zur internetbasierten Kommunikation: Phänomene, Herausforderungen, Erweiterungsvorschläge , 2013, J. Lang. Technol. Comput. Linguistics.
[10] Noah Bubenhofer,et al. A comparable Wikipedia corpus: from wiki syntax to POS tagged XML , 2011 .
[11] Dirk Riehle,et al. Design and implementation of the Sweble Wikitext parser: unlocking the structured data of Wikipedia , 2011, Int. Sym. Wikis.
[12] Bryan Ford,et al. Parsing expression grammars: a recognition-based syntactic foundation , 2004, POPL '04.
[13] Angelika Storrer,et al. A TEI Schema for the Representation of Computer-mediated Communication , 2012 .
[14] Diana Inkpen,et al. Segmentation Similarity and Agreement , 2012, NAACL.
[15] Oliver Ferschke,et al. Behind the Article: Recognizing Dialog Acts in Wikipedia Talk Pages , 2012, EACL.
[16] Marc Kupietz,et al. Recent Developments in DeReKo , 2014, LREC.
[17] Valentin Jijkoun,et al. Overview of the WiQA Task at CLEF 2006 , 2006, CLEF.
[18] Nancy Ide,et al. Corpus Encoding Standard: SGML guidelines for encoding linguistic corpora , 1998, LREC.
[19] Iryna Gurevych,et al. A Corpus-Based Study of Edit Categories in Featured and Non-Featured Wikipedia Articles , 2012, COLING.
[20] Harald Lüngen,et al. A TEI P5 Document Grammar for the IDS Text Model , 2012 .
[21] Gjergji Kasneci,et al. YAWN: A Semantically Annotated Wikipedia XML Corpus , 2007, BTW.