Building lexical tools to manage information written in Spanish

Most major operating systems and information retrieval tools do not provide lexical tools for Spanish, which makes it very difficult to check the information fed into these systems. The problem is especially critical in large organisations (libraries, museums, etc.) where information is acquired mechanically by scanning or typing, a process that adds new errors to those already present in the source. The need to filter large amounts of information written in Spanish led us to build lexical tools for the Spanish language. This paper presents COES, a complete environment that allows the user to deal with Spanish grammatical problems. Special emphasis is placed on the formal specification of Spanish grammar, word tagging and dictionary generation. Finally, some evaluation results for the spelling services are presented. COES has been freely distributed since the end of 1994.
