A file organization and maintenance procedure for dynamic document collections

Abstract. Several techniques have been proposed for clustering document collections. However, these techniques ignore the file maintenance problems that arise whenever the collection is dynamic. This paper describes a clustering algorithm designed for dynamic databases and presents an update procedure that maintains an effective document classification without reclustering. The effectiveness of the algorithms is demonstrated on a subset of the Cranfield collection.