Morpho-syntactic tagging system based on word patterns for Arabic texts

Text tagging is an important tool for many applications in natural language processing, including the morphological and syntactic analysis of texts, indexing and information retrieval, the "vocalization" of Arabic texts, and probabilistic language modeling (n-class models). However, existing tagging systems, which are based on a limited set of lexemes, are unable to handle unknown words. To overcome this problem, we develop in this paper a new system based on the patterns of unknown words and a hidden Markov model (HMM). The experiments use a set of labeled texts, a set of 3800 patterns, and 52 morpho-syntactic tags to estimate the parameters of the new HMM.
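As an illustration of the general idea only, the following minimal sketch shows how an HMM tagger can decode a tag sequence when each word is represented by its pattern rather than by its surface form, so that a word absent from the lexicon can still be scored. It uses standard Viterbi decoding; the two tags, the two pattern labels (`fa3il`, `yaf3alu`), and all probabilities are hypothetical toy values chosen for the example, not the 52 tags, the 3800 patterns, or the parameters estimated in this work.

```python
import math

def viterbi(observations, tags, start_p, trans_p, emit_p, floor=1e-6):
    """Return the most likely tag sequence for a list of pattern symbols."""
    def lp(p):
        # Log-probability with a small floor so unseen events are not -inf.
        return math.log(max(p, floor))

    # best[i][tag] = (log-score of the best path ending in tag, previous tag)
    best = [{
        t: (lp(start_p.get(t, 0.0)) + lp(emit_p[t].get(observations[0], 0.0)), None)
        for t in tags
    }]
    for obs in observations[1:]:
        column = {}
        for t in tags:
            score, prev = max(
                (best[-1][p][0] + lp(trans_p[p].get(t, 0.0)), p) for p in tags
            )
            column[t] = (score + lp(emit_p[t].get(obs, 0.0)), prev)
        best.append(column)

    # Backtrack from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t][0])]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path

# Hypothetical toy model: emissions are defined over patterns, not words.
TAGS = ["NOUN", "VERB"]
START = {"NOUN": 0.6, "VERB": 0.4}
TRANS = {
    "NOUN": {"NOUN": 0.5, "VERB": 0.5},
    "VERB": {"NOUN": 0.8, "VERB": 0.2},
}
EMIT = {
    "NOUN": {"fa3il": 0.7, "yaf3alu": 0.05},
    "VERB": {"fa3il": 0.1, "yaf3alu": 0.8},
}

if __name__ == "__main__":
    # Two unknown words, each already mapped to its pattern label.
    print(viterbi(["yaf3alu", "fa3il"], TAGS, START, TRANS, EMIT))
```

In a real system, the start, transition, and emission tables would be estimated from the labeled texts, and a separate step (not shown) would map each unknown word to its pattern.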