Automatic knowledge extraction from manufacturing research publications

Knowledge mining is a young and rapidly growing discipline that aims to identify valuable knowledge in digital documents automatically. This paper presents a study of applying document retrieval and text mining techniques to extract knowledge from CIRP research papers. The goal is to determine whether and how such tools can help researchers find relevant publications in a cluster of papers and increase the citation indices of their own papers. Two approaches to automatic topic identification are investigated: one is based on Latent Dirichlet Allocation applied to a very large document set, and the other uses Wikipedia to discover significant words in papers. The study combines both approaches to propose a new method for efficient and intelligent knowledge mining.