Context Analysis for Semantic Mapping of Data Sources Using a Multi-Strategy Machine Learning Approach

Whether on a web-wide or inter-enterprise scale, data integration has become a major necessity, driven by the expansion of the Internet and its widespread use for communication between business actors. However, since data sources are often heterogeneous, their integration remains an expensive procedure. Indeed, this task requires prior semantic alignment of all the concepts of the data sources. Performing this alignment manually is laborious, especially when a large number of concepts must be matched. Various solutions have been proposed to automate this step. This paper introduces a new framework for data source alignment that integrates context analysis with multi-strategy machine learning. Although their adaptability and extensibility are appreciated, current machine learning systems often suffer from the low quality and lack of diversity of their training data sets. To overcome this limitation, we introduce a new notion called the “informational context” of data sources. We then briefly describe the architecture of a context analyser to be integrated into a learning system that combines multiple strategies to achieve data source mapping.