Machine Learning of Indexing Rules for MEDLINE

Indexing is a crucial step in any information retrieval system. In MEDLINE, a widely used database of the biomedical literature, indexing consists of selecting Medical Subject Headings that describe the subject matter of each article. The need for automatic tools to assist human indexers in this task is growing with the increasing number of publications to be referenced in MEDLINE. In this paper, we describe the use and customization of Inductive Logic Programming (ILP) to infer indexing rules that may be used to produce automatic indexing recommendations for MEDLINE indexers. Our results show that this original ILP-based approach outperforms manual rules where such rules exist. We expect the sets of ILP rules obtained in this experiment to be integrated into the system that produces automatic indexing recommendations for MEDLINE.
