Multi-level NER for Portuguese in a CG Framework

This paper describes and evaluates a linguistically motivated NER system for Portuguese that draws on lexico-semantic information, pattern matching, and morphosyntactic, context-driven Constraint Grammar rules. Preliminary F-scores for cross-domain news texts, when distinguishing six different name types, were 91.85 (raw) and 93.6 (subtyping of ready-chunked proper nouns).