Automatic TIMEX2 tagging of Korean news

This article reports on a temporal tagger for Korean based on a Korean extension of the TIDES TIMEX2 guidelines. The extension, which primarily addresses the idiosyncrasies of Korean morphology, shows high inter-annotator reliability (0.893 F-measure for tag extent) when applied to a corpus of Korean newspaper articles. A machine-learning approach based on rote learning from a human-edited, automatically derived dictionary of temporal expressions is compared with a second approach that adds manual patterns and a third that tries to learn the patterns. Results for the first two are promising (0.87 F-measure for tag extent). Overall, the article shows that rote learning approaches can be very useful when language-specific features such as morphology are taken into account.
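
As a rough illustration of what dictionary-based (rote) tagging and an extent-level F-measure might look like, the following Python sketch tags the longest dictionary match at each character position and scores exact span matches. It is a minimal sketch under stated assumptions, not the system described in the article: the function names `tag_extents` and `extent_f_measure`, the three-entry lexicon, the example sentence, and the exact-match scoring criterion are all hypothetical.

```python
def tag_extents(text, lexicon):
    """Return (start, end) character spans of the longest lexicon matches.

    Matching is done on characters rather than whitespace tokens, so an
    entry is still found when a particle is attached to it (an assumption
    made for this sketch, not a description of the article's tagger).
    """
    max_len = max((len(entry) for entry in lexicon), default=0)
    spans = []
    i = 0
    while i < len(text):
        match_end = None
        # Try the longest candidate first so longer entries win.
        for length in range(min(max_len, len(text) - i), 0, -1):
            if text[i:i + length] in lexicon:
                match_end = i + length
                break
        if match_end is None:
            i += 1
        else:
            spans.append((i, match_end))
            i = match_end
    return spans


def extent_f_measure(predicted, gold):
    """F-measure over exact (start, end) span matches."""
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical lexicon: "yesterday", "tomorrow", "today".
lexicon = {"어제", "내일", "오늘"}
# Hypothetical sentence; "내일은" is "tomorrow" plus a topic particle.
text = "그는 어제 도착했고 내일은 떠난다"
spans = tag_extents(text, lexicon)
matched = [text[start:end] for start, end in spans]
```

Here `matched` contains the two dictionary entries found in the sentence, including the one followed by a particle. A real tagger would need a much larger dictionary and a more careful treatment of morphology than plain substring matching.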