INTEX: An FST Toolbox

INTEX is an integrated Natural Language Processing toolbox based on finite state transducers (FSTs). It parses texts of several million words and includes large-coverage dictionaries and grammars. Texts, dictionaries, and grammars are all represented internally by FSTs. Users may add their own dictionaries and grammars; these tools are applied to texts in order to locate lexical and syntactic patterns, remove ambiguities, and tag simple words as well as complex utterances. INTEX builds lemmatized concordances and indices of texts with respect to all types of finite state patterns. It is used as a lexical parser to produce the input of a syntactic parser, but it can also be viewed as an information retrieval system.
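
To make the idea of a dictionary stored as a transducer more concrete, the following Python sketch builds a toy letter-by-letter transducer that maps inflected forms to (lemma, tag) analyses and uses it to produce a small lemmatized concordance. It is an illustration only and does not reproduce INTEX's internal format or interface: the class DictionaryFST, the function lemmatized_concordance, the tag names, and the sample entries are all hypothetical.

```python
from collections import defaultdict


class DictionaryFST:
    """Toy dictionary transducer (hypothetical, not INTEX's format).

    Input: the letters of a word form. Output: the (lemma, tag)
    analyses attached to the state reached after the last letter.
    """

    def __init__(self):
        self.transitions = [{}]            # state -> {letter: next state}
        self.outputs = defaultdict(list)   # final state -> [(lemma, tag)]

    def add(self, form, lemma, tag):
        """Add one dictionary entry, creating states as needed."""
        state = 0
        for ch in form:
            nxt = self.transitions[state].get(ch)
            if nxt is None:
                nxt = len(self.transitions)
                self.transitions.append({})
                self.transitions[state][ch] = nxt
            state = nxt
        self.outputs[state].append((lemma, tag))

    def lookup(self, form):
        """Return all analyses of a form; an empty list if unknown."""
        state = 0
        for ch in form:
            state = self.transitions[state].get(ch)
            if state is None:
                return []
        return list(self.outputs.get(state, []))


def lemmatized_concordance(tokens, fst, lemma, width=2):
    """List (left context, match, right context) for every token
    that has at least one analysis with the requested lemma."""
    lines = []
    for i, tok in enumerate(tokens):
        if any(l == lemma for l, _ in fst.lookup(tok.lower())):
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


# Hypothetical sample entries; "run" is ambiguous between verb and noun.
fst = DictionaryFST()
fst.add("run", "run", "V")
fst.add("run", "run", "N")
fst.add("runs", "run", "V")
fst.add("ran", "run", "V")

tokens = "She ran home and he runs daily".split()
concordance = lemmatized_concordance(tokens, fst, "run", width=1)
```

In this sketch, looking up "run" returns two analyses, which is the kind of lexical ambiguity that a grammar applied afterwards would be expected to remove. The concordance lists both "ran" and "runs" under the lemma "run", each with one word of context on either side.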