Bilingual text alignment by agglomerative hierarchical clustering

Existing translations contain a wealth of ready-made solutions that can be reused to generate new high-quality translations. For this reason, translation resources are frequently stored in electronic databases that provide information retrieval facilities. Bilingual text alignment enables more efficient use of these resources by reconstructing the translation-equivalence links between corresponding segments of a text and its translations in other languages. Current text alignment algorithms perform quite successfully at the sentence level. However, further research is needed on finer-grained text alignment. To this end, we propose to identify translation correspondences through hierarchical cluster analysis of the graphical forms and repeated segments of bilingual texts. Through progressive agglomeration, this technique yields clusters of textual units with similar (or identical) distributional profiles. The results obtained suggest that hierarchical cluster analysis can be applied for a wide range of purposes in bilingual text alignment.
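As an illustration of the idea of grouping textual units by distributional profile, the following minimal sketch may help. It is not the method of the paper but a toy reconstruction under stated assumptions: the corpus is assumed to be already aligned at the sentence level, the profile of a graphical form is taken to be the set of aligned sentence pairs in which it occurs, the distance is the Jaccard distance, and clusters are merged by average linkage. None of these choices is specified in the abstract. The names `PAIRS`, `profiles`, `jaccard_distance` and `agglomerate` are hypothetical, and repeated segments are not handled.

```python
from itertools import combinations

# Toy sentence-aligned corpus (invented for illustration): (French, English).
PAIRS = [
    ("le chat dort", "the cat sleeps"),
    ("le chien dort", "the dog sleeps"),
    ("le chat mange", "the cat eats"),
    ("un chien mange", "a dog eats"),
]


def profiles(pairs):
    """Map each (language, form) unit to the set of pair indices where it occurs."""
    prof = {}
    for index, (source, target) in enumerate(pairs):
        for lang, sentence in (("fr", source), ("en", target)):
            for form in sentence.lower().split():
                prof.setdefault((lang, form), set()).add(index)
    return prof


def jaccard_distance(a, b):
    """Distance between two profiles: 0.0 when identical, 1.0 when disjoint."""
    return 1.0 - len(a & b) / len(a | b)


def agglomerate(prof, max_distance=0.5):
    """Merge the two closest clusters repeatedly (average linkage, an assumption).

    Stops when the closest pair of clusters is farther apart than max_distance.
    """
    clusters = [[unit] for unit in sorted(prof)]

    def linkage(c1, c2):
        total = sum(jaccard_distance(prof[u], prof[v]) for u in c1 for v in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        (i, j), distance = min(
            (((i, j), linkage(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if distance > max_distance:
            break
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters


# With max_distance=0.0, only units with identical profiles are grouped.
for cluster in agglomerate(profiles(PAIRS), max_distance=0.0):
    print(sorted(cluster))
```

On this toy corpus, a threshold of 0.0 leaves six clusters, each pairing one French form with the English form that shares its profile. Raising `max_distance` lets units with merely similar profiles agglomerate, which is where real corpora would require a carefully chosen distance and stopping criterion.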