Report on CLEF-2001 Experiments

In our first participation in the CLEF retrieval tasks, our primary objective was to define a general stopword list for each of several European languages (namely French, Italian, German and Spanish) and to propose simple and efficient stemming procedures for them. Our second aim was to propose a combined approach that could be implemented to facilitate effective access to multilingual collections.
