Dense Word Representation Utilization in Indonesian Dependency Parsing

Available Indonesian dependency parsers lag behind parsers for languages that have been researched more thoroughly. In particular, current Indonesian dependency parsers cannot reliably parse sentences that contain gerunds and/or ellipsis. This is because the sparse feature representation they rely on makes these types of sentences difficult to parse. In this research, we propose a dense word representation for Indonesian dependency parsing. A dense word representation may allow better generalization and provide more information about the words being parsed, which enables more accurate parsing. The scope of this research is limited to well-formed Indonesian sentences and to local transition-based parsing. In our experiments, using word embeddings instead of a sparse word representation increased parsing accuracy significantly.
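
As a rough illustration of why a dense representation may generalize better than a sparse one, the sketch below compares the two for a toy vocabulary. It is not the parser described here: the three-word vocabulary, the helper names (`cosine`, `one_hot`), and the two-dimensional `toy_embedding` values are all hypothetical, hand-picked for illustration rather than learned from data.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Hypothetical toy vocabulary: "makan" (eat), "minum" (drink), "buku" (book).
vocab = ["makan", "minum", "buku"]

def one_hot(word):
    """Sparse representation: one dimension per vocabulary entry."""
    return [1.0 if w == word else 0.0 for w in vocab]

# Hypothetical dense vectors (made-up values, NOT trained embeddings).
toy_embedding = {
    "makan": [0.9, 0.1],
    "minum": [0.8, 0.2],
    "buku": [0.1, 0.9],
}

# Any two distinct one-hot vectors are orthogonal, so the sparse
# representation carries no notion of similarity between words.
sparse_sim = cosine(one_hot("makan"), one_hot("minum"))

# Dense vectors can place related words close together, so evidence
# seen for one word can transfer to a similar word.
dense_sim = cosine(toy_embedding["makan"], toy_embedding["minum"])
dense_sim_unrelated = cosine(toy_embedding["makan"], toy_embedding["buku"])

print(sparse_sim, dense_sim, dense_sim_unrelated)
```

With a sparse representation every pair of distinct words is equally dissimilar, whereas the dense vectors in this toy setup place the two verbs closer to each other than to the noun. In a real system the vectors would come from trained word embeddings.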