Using Stanford Part-of-Speech Tagger for the Morphologically-rich Filipino Language

This research implements a Maximum Entropy-based Part-of-Speech (POS) tagger for Filipino. It uses the Stanford POS tagger, a trainable tagger that has been trained on English, Chinese, Arabic, and other languages and has produced some of the highest accuracies reported for each. The tagger was trained for Filipino on a 406k-token corpus, taking into account Filipino linguistic phenomena such as rich morphology and intra-sentential code-switching. The resulting Filipino POS tagger achieved 96.15% tagging accuracy, currently the highest among existing POS taggers for Filipino and by a large margin.
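To make the reported metric concrete, the following is a minimal sketch of how token-level tagging accuracy is commonly computed: the share of tokens whose predicted tag matches the gold-standard tag. It is an illustration only and not the evaluation code used in this research. The function name `tagging_accuracy`, the example sentence, and the tag labels (`VB`, `DT`, `NN`) are assumptions chosen for the example and do not reflect the tagset used for the Filipino corpus.

```python
from typing import List, Tuple

# A tagged sentence is a list of (token, tag) pairs.
TaggedSentence = List[Tuple[str, str]]


def tagging_accuracy(gold: List[TaggedSentence],
                     predicted: List[TaggedSentence]) -> float:
    """Return the fraction of tokens whose predicted tag equals the gold tag."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must contain the same number of sentences")

    correct = 0
    total = 0
    for gold_sent, pred_sent in zip(gold, predicted):
        if len(gold_sent) != len(pred_sent):
            raise ValueError("aligned sentences must have the same number of tokens")
        for (_, gold_tag), (_, pred_tag) in zip(gold_sent, pred_sent):
            total += 1
            if gold_tag == pred_tag:
                correct += 1

    if total == 0:
        raise ValueError("cannot compute accuracy over zero tokens")
    return correct / total


# Hypothetical example: one three-token sentence with one tagging error.
gold = [[("Kumain", "VB"), ("ang", "DT"), ("bata", "NN")]]
predicted = [[("Kumain", "VB"), ("ang", "DT"), ("bata", "VB")]]
accuracy = tagging_accuracy(gold, predicted)
print(f"{accuracy:.2%}")  # prints 66.67%
```

Under this definition, an accuracy of 96.15% means that roughly 96 of every 100 tokens in the evaluation data received the same tag as in the gold annotation.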
