A Tool for Document Mining: A Hybrid Numerical-Linguistic Approach

Computer-aided knowledge extraction from large bodies of text generally relies on one of two methods: statistical or linguistic. Until recently, specialists regarded these two methods as divergent. In this article we endeavor to show that they are instead complementary. The statistical method can process large volumes of data, but its results are rather coarse; the linguistic method, based on semantic analysis, yields better results and, moreover, allows a structured representation of knowledge. The idea is to combine the two strategies in a single integrated process in order to obtain better results. Two systems are considered: CONTERM, a connectionist-based model, and SEEK, a semantic-based one. The integration of SEEK into CONTERM is being considered within the framework of FRANCIL (Francophone Language Engineering Network). This research project is a collaboration between LANCI, University of Quebec in Montreal; CAMS, Paris IV; and CREDO/IDIST, Lille III.