DESIGN AND EVALUATION OF A WEB SYSTEM SUPPORTING VARIOUS TEXT MINING TASKS FOR THE PURPOSES OF EDUCATION AND RESEARCH

This paper presents an original solution that provides the functionality needed to design, implement, or simply evaluate various text mining techniques. It is based on a Java library called JBOWL, which was designed as an open-source API to support the different phases of the text mining process and offers a wide range of classification and clustering algorithms. JBOWL is particularly useful for extending existing software applications with text mining capabilities, as well as for supporting practical education in text mining and its use. We present two cases in which JBOWL has been successfully integrated and tailored to a specific mode of use: the first is its integration into a collaborative application called the KP-Lab System, and the second is a web-based system for educational purposes. The proposed solution supports the whole text mining process, from the creation of a corpus of relevant documents, through the application of various pre-processing methods, to the creation of text mining models in the form of classifiers and the evaluation of the obtained models. Concurrent execution of different tasks is supported by a task-based execution engine, which provides a middleware-like transparent layer for distributed execution. The developed solution was evaluated within the university course Knowledge management, organized at the Department of Cybernetics and Artificial Intelligence, Faculty of Electrical Engineering and Informatics, Technical University of Kosice. The paper also describes the experiments performed and their results.
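
To make the process outlined above concrete, the following is a minimal plain-Java sketch of its main stages: a small labelled corpus, simple pre-processing, a classifier, and an evaluation step. It does not use or reproduce the JBOWL API. The class name `TextMiningSketch`, the toy documents, the stop-word list, and the choice of a multinomial Naive Bayes classifier with add-one smoothing are all assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only: plain Java, NOT the JBOWL API. All names are hypothetical.
public class TextMiningSketch {
    // Assumed, deliberately tiny stop-word list.
    private static final Set<String> STOP_WORDS = Set.of("the", "a", "of", "is", "and", "to");

    // Pre-processing: lowercase, split on non-letters, drop stop words.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("[^a-z]+")) {
            if (!t.isEmpty() && !STOP_WORDS.contains(t)) tokens.add(t);
        }
        return tokens;
    }

    // Model: multinomial Naive Bayes over term counts, with add-one smoothing.
    static class NaiveBayes {
        final Map<String, Integer> docsPerLabel = new HashMap<>();
        final Map<String, Integer> tokensPerLabel = new HashMap<>();
        final Map<String, Map<String, Integer>> termCounts = new HashMap<>();
        final Set<String> vocabulary = new HashSet<>();
        int totalDocs = 0;

        void train(String text, String label) {
            totalDocs++;
            docsPerLabel.merge(label, 1, Integer::sum);
            Map<String, Integer> counts = termCounts.computeIfAbsent(label, k -> new HashMap<>());
            for (String t : tokenize(text)) {
                counts.merge(t, 1, Integer::sum);
                tokensPerLabel.merge(label, 1, Integer::sum);
                vocabulary.add(t);
            }
        }

        // Returns the label with the highest log-probability score.
        String classify(String text) {
            List<String> tokens = tokenize(text);
            String best = null;
            double bestScore = Double.NEGATIVE_INFINITY;
            for (String label : docsPerLabel.keySet()) {
                double score = Math.log((double) docsPerLabel.get(label) / totalDocs);
                int labelTokens = tokensPerLabel.getOrDefault(label, 0);
                for (String t : tokens) {
                    int c = termCounts.get(label).getOrDefault(t, 0);
                    score += Math.log((c + 1.0) / (labelTokens + vocabulary.size()));
                }
                if (score > bestScore) {
                    bestScore = score;
                    best = label;
                }
            }
            return best;
        }
    }

    // Evaluation: fraction of held-out documents classified correctly.
    static double accuracy(NaiveBayes model, String[][] labelled) {
        int correct = 0;
        for (String[] doc : labelled) {
            if (model.classify(doc[0]).equals(doc[1])) correct++;
        }
        return (double) correct / labelled.length;
    }

    // Corpus creation: a toy training set of {text, label} pairs.
    static NaiveBayes trainToyModel() {
        String[][] training = {
            {"The match ended with a late goal", "sport"},
            {"The team won the football league", "sport"},
            {"The bank raised interest rates", "finance"},
            {"Stock market and bank shares fell", "finance"}
        };
        NaiveBayes model = new NaiveBayes();
        for (String[] doc : training) model.train(doc[0], doc[1]);
        return model;
    }

    // Held-out documents used only for evaluation.
    static String[][] toyTestSet() {
        return new String[][] {
            {"A goal decided the football match", "sport"},
            {"Interest rates hit the stock market", "finance"}
        };
    }

    public static void main(String[] args) {
        NaiveBayes model = trainToyModel();
        System.out.println("accuracy = " + accuracy(model, toyTestSet()));
    }
}
```

Running `main` trains the model on the four toy documents and prints its accuracy on the two held-out ones. A real corpus would need a proper train/test split or cross-validation, and richer pre-processing such as stemming and term weighting.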