Information Fusion for Scientific Literature Classification

The scientific literature can be organized to serve as a roadmap for researchers, showing where the scientific community has been and where it is heading. It presents historical and current state-of-the-art knowledge in the areas of study of interest. It also records valuable metadata, including author lists, affiliated institutions, citation information, and keywords, which can be used to extract further information that assists in analyzing the content of documents and their relationships with one another. However, the rapidly growing volume of literature and the increasing diversity of research fields have become a major concern, especially for the organization, analysis, and exploration of such documents. This chapter proposes an automatic scientific literature classification method (ASLCM) that uses different kinds of information extracted from the literature to organize and present it in a structured manner. In the proposed ASLCM, multiple types of similarity information are extracted from all available sources and fused using a genetic algorithm to give an optimized and more meaningful classification. The final result is used to identify the different research disciplines within the collection, their emergence and termination, major collaborators, centers of excellence, their influence, and the flow of information among the multidisciplinary research areas.
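
To make the idea of fusing several similarity sources with a genetic algorithm concrete, the following is a minimal sketch and not the chapter's actual ASLCM. It assumes that fusion is a weighted sum of pairwise similarity matrices and that the genetic algorithm searches for the weights. The fitness function (mean within-class similarity minus mean between-class similarity on a few labeled documents), the toy matrices, and all names such as fuse, fitness, and evolve_weights are hypothetical choices made only for illustration.

```python
import random

def fuse(matrices, weights):
    """Weighted sum of equally sized similarity matrices (assumed fusion rule)."""
    n = len(matrices[0])
    return [[sum(w * m[i][j] for w, m in zip(weights, matrices))
             for j in range(n)] for i in range(n)]

def normalize(weights):
    """Scale non-negative weights so that they sum to 1."""
    total = sum(weights)
    if total == 0:
        return [1.0 / len(weights)] * len(weights)
    return [w / total for w in weights]

def fitness(weights, matrices, labels):
    """Assumed objective: mean within-class minus mean between-class similarity."""
    fused = fuse(matrices, weights)
    within, between = [], []
    n = len(labels)
    for i in range(n):
        for j in range(i + 1, n):
            (within if labels[i] == labels[j] else between).append(fused[i][j])
    return sum(within) / len(within) - sum(between) / len(between)

def evolve_weights(matrices, labels, pop_size=20, generations=40, seed=0):
    """Simple genetic algorithm over fusion weights (illustrative only)."""
    rng = random.Random(seed)
    k = len(matrices)
    population = [normalize([rng.random() for _ in range(k)])
                  for _ in range(pop_size)]

    def score(w):
        return fitness(w, matrices, labels)

    for _ in range(generations):
        # Truncation selection: the better half survives unchanged (elitism).
        population.sort(key=score, reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Arithmetic crossover followed by Gaussian mutation.
            child = [(x + y) / 2 for x, y in zip(a, b)]
            child = [max(0.0, c + rng.gauss(0, 0.1)) for c in child]
            children.append(normalize(child))
        population = parents + children
    return max(population, key=score)

# Toy data: four documents, two hypothetical similarity sources.
citation_sim = [[1.0, 0.9, 0.1, 0.1],
                [0.9, 1.0, 0.1, 0.1],
                [0.1, 0.1, 1.0, 0.9],
                [0.1, 0.1, 0.9, 1.0]]
keyword_sim = [[1.0, 0.2, 0.8, 0.7],
               [0.2, 1.0, 0.6, 0.8],
               [0.8, 0.6, 1.0, 0.3],
               [0.7, 0.8, 0.3, 1.0]]
labels = [0, 0, 1, 1]  # assumed reference disciplines for the toy documents

best = evolve_weights([citation_sim, keyword_sim], labels)
fused = fuse([citation_sim, keyword_sim], best)
```

In this toy setting the citation-based matrix separates the two groups and the keyword-based matrix does not, so the search should favor the citation weight. A real system would need an objective that does not depend on hand-assigned labels, for example a clustering quality measure computed on the fused matrix.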