Turkish and its challenges for language processing

We present a short survey and exposition of some of the important aspects of Turkish that have proven challenging for natural language processing. Most of the challenges stem from the complex morphology of Turkish and how morphology interacts with syntax. We also provide a short overview of the major tools and resources developed for Turkish natural language processing over the last two decades.
