Morphosyntactic Corpora and Tools for Persian

This thesis presents open-source resources, in the form of annotated corpora and modules, for automatic morphosyntactic processing and analysis of Persian texts. More specifically, the resources cons ...
