Manual Clustering Refinement using Interaction with Blobs

The large number of automatic clustering methods underlines one point: no single clustering method is optimal for all possible cases. In certain application domains, such as genomics and natural language processing, it is not even clear whether any of the known clustering methods suffices. In such cases, an automatic clustering method is often followed by manual refinement. The refined clustering may then serve as an illustration, as a reference, or as input to a rule-based or other machine learning algorithm that yields a new clustering method. In this paper, we describe a novel interaction technique for manual cluster refinement that uses the metaphor of soap bubbles, represented by special implicit surfaces (blobs). For instance, entities can simply be moved into and out of these blobs. A modified force-directed layout process automatically arranges the entities equidistantly on the screen. The modifications include reducing the expected amount of computation per iteration to O(|V| log |V| + |E|), in order to achieve the short response times that an interactive system requires. We also put considerable effort into making the display of blobs fast enough for interactive use.
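
To make the blob metaphor concrete, the following is a minimal sketch of how an implicit surface can enclose a cluster. It assumes a generic formulation (a sum of Gaussian kernels in the style of the classic "blobby" model), not necessarily the special implicit surfaces used in this work. The names blob_field and inside_blob, and the radius and threshold parameters, are hypothetical and chosen for illustration only.

```python
import math

def blob_field(point, centers, radius=1.0):
    """Field value at `point`: sum of Gaussian kernels, one per entity.

    Assumption: a generic blobby-model kernel, not the paper's own surface.
    """
    px, py = point
    total = 0.0
    for cx, cy in centers:
        d2 = (px - cx) ** 2 + (py - cy) ** 2
        total += math.exp(-d2 / (radius * radius))
    return total

def inside_blob(point, centers, radius=1.0, threshold=0.5):
    """A point lies inside the blob where the field reaches the threshold."""
    return blob_field(point, centers, radius) >= threshold

# Hypothetical cluster of three entities in screen coordinates.
cluster = [(0.0, 0.0), (1.0, 0.0), (0.5, 0.8)]

near = inside_blob((0.5, 0.3), cluster)  # point between the entities
far = inside_blob((5.0, 5.0), cluster)   # point far away from the cluster
```

In such a formulation the blob boundary is the iso-contour where the field equals the threshold, so dragging an entity across that contour corresponds to moving it into or out of the cluster.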