Should Have, Would Have, Could Have. Investigating Verb Group Representations for Parsing with Universal Dependencies.

Treebanks have recently been released for a number of languages with the harmonized annotation created by the Universal Dependencies (UD) project. The representation of certain constructions in UD is known to be suboptimal for parsing, and it may be worth transforming these constructions for that purpose. In this paper, we focus on the representation of verb groups. Several studies have shown that parsing works better when auxiliaries are the heads of auxiliary dependency relations, which is not the case in UD. We therefore transformed verb groups in UD treebanks, parsed the test set, and transformed it back. Contrary to expectations, we observed significant decreases in accuracy. We provide suggestive evidence that the improvements in previous studies were obtained because the transformation helps disambiguate the POS tags of main verbs and auxiliaries. The question of why parsing accuracy decreases with this approach in the case of UD is left open.
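To make the round-trip transformation concrete, the following is a minimal sketch under simplifying assumptions, not the procedure used in the paper. It assumes a toy token format (dicts with `id`, `form`, `head`, `deprel`), reattaches only the auxiliaries and the main verb, and leaves all other dependents on the main verb. The function names `to_aux_head` and `to_ud` are hypothetical.

```python
from copy import deepcopy

def to_aux_head(tokens):
    """UD style (main verb heads its auxiliaries) -> auxiliary-headed chain."""
    out = deepcopy(tokens)
    by_id = {t["id"]: t for t in out}
    # Collect the auxiliaries of each main verb before changing anything.
    groups = {}
    for t in out:
        if t["deprel"] == "aux":
            groups.setdefault(t["head"], []).append(t)
    for verb_id, auxes in groups.items():
        verb = by_id[verb_id]
        auxes.sort(key=lambda t: t["id"])  # assumption: chain follows word order
        chain = auxes + [verb]
        # The first auxiliary takes over the main verb's attachment.
        chain[0]["head"], chain[0]["deprel"] = verb["head"], verb["deprel"]
        # Every later element of the chain attaches to the one before it.
        for parent, child in zip(chain, chain[1:]):
            child["head"], child["deprel"] = parent["id"], "aux"
    return out

def to_ud(tokens):
    """Inverse of to_aux_head for trees produced by it."""
    out = deepcopy(tokens)
    aux_child = {t["head"]: t for t in out if t["deprel"] == "aux"}
    # A chain starts at a token that has an "aux" child but is not one itself.
    tops = [t for t in out if t["id"] in aux_child and t["deprel"] != "aux"]
    for top in tops:
        chain = [top]
        while chain[-1]["id"] in aux_child:
            chain.append(aux_child[chain[-1]["id"]])
        verb = chain[-1]  # the bottom of the chain is the main verb
        verb["head"], verb["deprel"] = top["head"], top["deprel"]
        for aux in chain[:-1]:
            aux["head"], aux["deprel"] = verb["id"], "aux"
    return out

# Toy sentence in UD style: "She could have gone"
sentence = [
    {"id": 1, "form": "She",   "head": 4, "deprel": "nsubj"},
    {"id": 2, "form": "could", "head": 4, "deprel": "aux"},
    {"id": 3, "form": "have",  "head": 4, "deprel": "aux"},
    {"id": 4, "form": "gone",  "head": 0, "deprel": "root"},
]
transformed = to_aux_head(sentence)
restored = to_ud(transformed)
```

In this sketch, `transformed` makes "could" the root with "have" and "gone" chained beneath it, and `restored` matches the original UD-style tree. A realistic implementation would also need to decide which dependents move with the head and how to handle cases where the chain cannot be recovered unambiguously from parser output.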