Visualization Techniques to Explore Data Mining Results for Document Collections

Data mining has been informally introduced as the large-scale search for interesting patterns in data. It is often an exploratory task, performed iteratively within the process of knowledge discovery in databases. Interactive visualization techniques are also applied successfully for data exploration within this process. We deal with the synergy of these two complementary approaches. Whereas data mining typically relies on strategies for systematic search in large hypothesis spaces, guided by the autonomous evaluation of statistical tests, interactive visualization engages the visual capacities of an analyst to identify patterns, which may in turn suggest the further direction of the exploration process. We demonstrate some possibilities for combining these approaches in the area of data mining in document collections. Document Explorer is a system that offers various preprocessing tools to prepare collections of text or multimedia documents available in distributed environments (e.g., Internet and intranet) for data mining applications, and it includes data mining methods based on searching for patterns such as frequent sets or association rules. Keyword graphs are used in this system as a highly interactive technique for presenting the mining results. The user can operate on the visualized results to redirect the data mining process, to filter and structure the results, to link several graphs, or to browse into the document collection. Thus, the keyword graphs present the relations between interesting sets of keywords (the sets may also be regarded as retrieval queries to be posed to the collection) and make them operable for the analyst.
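To make the notions of frequent sets and keyword graphs concrete, the following is a minimal sketch and not the Document Explorer implementation. It assumes each document has already been reduced to a set of keywords, counts keyword sets by brute-force enumeration (a real system would more likely use a levelwise search such as Apriori), and treats every frequent keyword pair as a weighted edge of a keyword graph. The function names, the toy collection, and the support threshold are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_keyword_sets(docs, min_support, max_size=2):
    """Return keyword sets (up to max_size keywords) that occur in at
    least min_support documents, mapped to their support counts.
    Brute-force enumeration; an assumption made for brevity."""
    counts = Counter()
    for keywords in docs.values():
        for size in range(1, max_size + 1):
            for combo in combinations(sorted(keywords), size):
                counts[frozenset(combo)] += 1
    return {s: c for s, c in counts.items() if c >= min_support}


def keyword_graph(frequent_sets):
    """Build graph edges from frequent keyword pairs.
    Each edge is keyed by a sorted keyword pair; its weight is the support."""
    return {tuple(sorted(s)): c for s, c in frequent_sets.items() if len(s) == 2}


def retrieve(docs, keyword_set):
    """Treat a keyword set as a conjunctive retrieval query and
    return the ids of all documents containing every keyword."""
    query = set(keyword_set)
    return sorted(doc_id for doc_id, kws in docs.items() if query <= kws)


# Hypothetical toy collection: document id -> keyword set.
docs = {
    "d1": {"iran", "oil", "usa"},
    "d2": {"iran", "oil"},
    "d3": {"oil", "usa"},
    "d4": {"iran", "oil", "usa"},
    "d5": {"usa", "wheat"},
}

frequent = frequent_keyword_sets(docs, min_support=3)
edges = keyword_graph(frequent)

for (a, b), support in sorted(edges.items()):
    print(f"{a} -- {b} (support {support})")
    print("  documents:", retrieve(docs, {a, b}))
```

Selecting an edge and calling `retrieve` with its two keywords mirrors the browsing step described above: the keyword set behind a graph element doubles as a query that leads back into the document collection.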