Textual and chemical information processing: different domains but similar algorithms

This paper discusses the extent to which algorithms developed for the processing of textual databases are also applicable to the processing of chemical structure databases, and vice versa. Applications discussed include: an algorithm for distribution sorting that has been applied to the design of screening systems for rapid chemical substructure searching; the use of measures of inter-molecular structural similarity for the analysis of hypertext graphs; a genetic algorithm, developed for calculating term weights in relevance feedback searching, that has been applied to determining whether a molecule is likely to exhibit biological activity; and the use of data fusion to combine the results of different chemical similarity searches.
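
To make the last of these applications concrete, the following is a minimal sketch of data fusion over two similarity searches. It is illustrative only and is not taken from the paper: it assumes that molecules are represented as sets of "on" fingerprint bit positions, that similarity is measured with the Tanimoto coefficient, and that rankings are combined by summing rank positions. The function names, molecule identifiers, and fingerprints are all hypothetical.

```python
from typing import Dict, List, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of set-bit positions."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    common = len(a & b)
    return common / (len(a) + len(b) - common)


def rank_by_similarity(query: Set[int], database: Dict[str, Set[int]]) -> List[str]:
    """Return molecule names ordered by decreasing similarity to the query.

    Ties are broken alphabetically so the output is deterministic.
    """
    return sorted(database, key=lambda name: (-tanimoto(query, database[name]), name))


def fuse_rankings(rankings: List[List[str]]) -> List[str]:
    """Combine several rankings by summing each molecule's rank positions.

    A lower total means the molecule was ranked highly across the searches.
    Assumes every ranking contains the same set of molecule names.
    """
    totals: Dict[str, int] = {}
    for ranking in rankings:
        for position, name in enumerate(ranking, start=1):
            totals[name] = totals.get(name, 0) + position
    return sorted(totals, key=lambda name: (totals[name], name))


# Hypothetical data: the same three molecules under two different
# fingerprint representations, each searched with its own query fingerprint.
database_a = {"mol1": {1, 2, 3, 4}, "mol2": {1, 2, 5}, "mol3": {7, 8}}
database_b = {"mol1": {10, 11}, "mol2": {10, 11, 12}, "mol3": {12, 13}}
query_a = {1, 2, 3}
query_b = {10, 11, 12}

ranking_a = rank_by_similarity(query_a, database_a)
ranking_b = rank_by_similarity(query_b, database_b)
fused = fuse_rankings([ranking_a, ranking_b])
```

Here the two searches disagree about the top-ranked molecule, and the fused list reflects both. Other fusion rules, such as taking the maximum similarity score per molecule, would fit the same structure; sum of ranks is chosen here only for simplicity.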