Les supports de vocabulaires pour les systèmes de recherche d'information orientés précision : application aux graphes pour la recherche d'information médicale. (Vocabulary supports for precision oriented information retrieval systems: application to graphs for medical information retrieval)

This Ph.D. thesis explores a framework for developing precision-oriented information retrieval (IR) models. The framework promotes the notion of vocabulary support to model the expressive representations used by IR systems. Few modelling frameworks are available for specifying IR systems; we propose one that focuses on modelling expressiveness and that can be used to choose or compare models according to their level of expressiveness. Within this framework, we move towards an expressive representation of text and propose two models that use highly expressive representations. Both models are based on graphs. Though the two models are similar in expressiveness, they differ in their underlying formalisms: the first is derived from conceptual graphs, and the second from the language modelling approach to IR. To apply these models to text, we propose a two-step process based on language processing that promotes information coverage. The first step, which is domain dependent, produces an intermediate representation of documents in which each sentence is represented by a graph. The second step builds the final document representations from the intermediate one. Finally, we apply our two models to the medical domain. To do so, we use the UMLS meta-thesaurus and propose several ways to build the intermediate representation of documents. The effectiveness of our models is demonstrated by a number of experiments on the CLEF medical campaign. This campaign enables us to test our models in a real-world setting and to compare them with those of other teams.