Recognition and normalization of some classes of named entities in Serbian

In this paper we present a system for the recognition and normalization of measurement expressions, money expressions, and temporal expressions for dates and times in Serbian newspaper texts. Normalization of amount expressions converts the numerals to fixed-point notation and the currencies and measurement units to their standard or common abbreviations, while temporal expressions are converted to the TimeML format. For this purpose, we use our general lexical resources and develop some new ones. The system itself consists of a large collection of finite-state transducers. Finally, we present evaluation data showing that the system performs well, with well-balanced precision and recall.
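
To make the two normalization targets concrete, the following minimal sketch shows what such a mapping might look like for a few simple inputs. It is only an illustration under stated assumptions: it uses regular expressions in place of the finite-state transducers the system is built from, the tiny lexicons and the names `normalize_amount` and `normalize_date` are hypothetical, and the choice of ISO codes such as RSD and EUR as the "standard abbreviations" is an assumption rather than something the abstract specifies.

```python
import re
from decimal import Decimal

# Hypothetical mini-lexicons for illustration only; the resources described
# in the paper are not reproduced here.
MULTIPLIERS = {"hiljada": Decimal(10**3), "miliona": Decimal(10**6)}
UNITS = {"dinara": "RSD", "evra": "EUR", "kilograma": "kg", "metara": "m"}
MONTHS = {
    "januara": 1, "februara": 2, "marta": 3, "aprila": 4,
    "maja": 5, "juna": 6, "jula": 7, "avgusta": 8,
    "septembra": 9, "oktobra": 10, "novembra": 11, "decembra": 12,
}

# number (decimal comma allowed), optional multiplier word, unit word
AMOUNT_RE = re.compile(r"(\d+(?:,\d+)?)\s+(?:(\w+)\s+)?(\w+)")
# day with trailing dot, month name in the genitive, four-digit year
DATE_RE = re.compile(r"(\d{1,2})\.\s+(\w+)\s+(\d{4})\.?")


def normalize_amount(text):
    """Return 'fixed-point-number unit-abbreviation', or None if unrecognized."""
    m = AMOUNT_RE.fullmatch(text.strip())
    if m is None:
        return None
    number, multiplier, unit = m.groups()
    if unit not in UNITS:
        return None
    if multiplier is not None and multiplier not in MULTIPLIERS:
        return None
    value = Decimal(number.replace(",", "."))  # decimal comma -> decimal point
    if multiplier is not None:
        value *= MULTIPLIERS[multiplier]
    # normalize() drops trailing zeros; the 'f' format avoids exponent notation
    return f"{format(value.normalize(), 'f')} {UNITS[unit]}"


def normalize_date(text, tid="t1"):
    """Wrap a simple 'day. month year.' date in a TimeML TIMEX3 tag."""
    stripped = text.strip()
    m = DATE_RE.fullmatch(stripped)
    if m is None or m.group(2) not in MONTHS:
        return None
    day, month, year = int(m.group(1)), MONTHS[m.group(2)], int(m.group(3))
    value = f"{year:04d}-{month:02d}-{day:02d}"
    return f'<TIMEX3 tid="{tid}" type="DATE" value="{value}">{stripped}</TIMEX3>'
```

For example, `normalize_amount("3,5 miliona dinara")` yields `3500000 RSD`, and `normalize_date("15. marta 2010.")` yields a `TIMEX3` element with `value="2010-03-15"`. Inflected forms, numerals written out in words, and ambiguous contexts are not handled by this sketch; covering them is where lexical resources and transducer cascades would come in.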