Information fusion experiments for text classification

We summarize our experiments and results in applying information fusion to the automatic classification of free-text documents into a given number of categories. We also attempt to characterize this information fusion work in terms of the Joint Directors of Laboratories (JDL) scheme. The text used in the experiments is taken from the Reuters-22173 collection, which comes pre-analyzed and facilitates both training of the neural networks and evaluation of the classification decisions. We use several kinds of feature extractors to derive information from the documents, and we use neural networks for both learning and fusion. We compare the effectiveness of individual feature extractors in classifying the text with that of information fusion over various combinations of feature extractors. The results indicate that information fusion almost always performs better than the individual feature extractors, and that certain combinations perform better than others. Additional parameters may affect performance to varying degrees; their effect remains to be investigated.
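
To make the general idea concrete, the following is a minimal sketch of neural-network fusion over two feature extractors. It is an illustration under stated assumptions, not the configuration used in the experiments: the toy corpus, the two extractors (a binary bag-of-words and hashed character trigrams), the single-layer logistic networks, and all function names are hypothetical and do not come from the Reuters-22173 setup.

```python
import numpy as np

# Hypothetical toy corpus (NOT Reuters-22173): label 1 = "grain", 0 = "money".
DOCS = [
    ("wheat harvest rises as corn exports grow", 1),
    ("corn and wheat prices fall after harvest", 1),
    ("grain shipments of wheat delayed at port", 1),
    ("bank raises interest rate on dollar loans", 0),
    ("dollar falls as central bank cuts rate", 0),
    ("currency market steady after bank meeting", 0),
]

def word_features(text, vocab):
    """Extractor A (assumed): binary bag-of-words over a fixed vocabulary."""
    tokens = set(text.split())
    return np.array([1.0 if w in tokens else 0.0 for w in vocab])

def trigram_features(text, dim=32):
    """Extractor B (assumed): character trigrams hashed into `dim` buckets."""
    vec = np.zeros(dim)
    for i in range(len(text) - 2):
        # Deterministic toy hash: sum of character codes modulo `dim`.
        vec[sum(ord(c) for c in text[i:i + 3]) % dim] += 1.0
    return vec / max(vec.sum(), 1.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logistic(X, y, epochs=500, lr=0.5):
    """Single-layer network (one logistic unit), batch gradient descent."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        grad = sigmoid(X @ w + b) - y  # derivative of cross-entropy w.r.t. logit
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b

def predict(X, params):
    w, b = params
    return sigmoid(X @ w + b)

def fit_fusion(docs):
    """Train one network per extractor, then a fusion network on their outputs."""
    texts = [t for t, _ in docs]
    y = np.array([label for _, label in docs], dtype=float)
    vocab = sorted({w for t in texts for w in t.split()})
    XA = np.array([word_features(t, vocab) for t in texts])
    XB = np.array([trigram_features(t) for t in texts])
    net_a = train_logistic(XA, y)
    net_b = train_logistic(XB, y)
    # Fusion input: one column per extractor-level decision.
    fused_in = np.column_stack([predict(XA, net_a), predict(XB, net_b)])
    net_f = train_logistic(fused_in, y)
    return vocab, net_a, net_b, net_f

def classify(text, model):
    """Return the fused score in [0, 1] for membership in class 1."""
    vocab, net_a, net_b, net_f = model
    pa = predict(word_features(text, vocab)[None, :], net_a)
    pb = predict(trigram_features(text)[None, :], net_b)
    return float(predict(np.column_stack([pa, pb]), net_f)[0])
```

Calling `model = fit_fusion(DOCS)` and then `classify(some_text, model)` yields a fused score that can be compared against each extractor's own score. The sketch fuses at the decision level; feeding the concatenated feature vectors into a single network would be an alternative design, and the abstract does not say which variant the experiments used.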