W-efficient partitions and the solution of the sequential clustering problem

Clustering involves partitioning a set of related objects into mutually exclusive and collectively exhaustive clusters. The objective is to form clusters that minimize the differences among their member objects, as measured by the relevant clustering criterion. Most statements of clustering problems assume that the number of clusters, g, in the partition is known. In practice, an appropriate value for g may not be obvious. It is known that the optimal value of the clustering criterion function improves as g increases. However, for some values of g, the rate of improvement may be less than expected. Because each additional cluster may carry a cost, there is also interest in identifying those values of g that offer attractive rates of improvement. Partitions that are optimal for a given g, and for which that g offers an attractive rate of improvement, are referred to as w-efficient; all other partitions, even if optimal for their g, are referred to as w-inefficient. We present a linear programming approach for generating the w-efficient partitions of the sequential clustering problem, and demonstrate the importance of w-efficient partitions to the efficient solution of that problem.
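
The trade-off between g and the criterion value can be illustrated with a minimal sketch. It is not the linear programming approach of this paper: it uses a standard dynamic program for partitioning ordered objects into contiguous clusters, and it assumes the criterion is the within-cluster sum of squared deviations. It also assumes, purely for illustration, that a value of g offers an "attractive rate of improvement" when the point (g, optimal criterion value) is a vertex of the lower convex hull of all such points. The names `make_cost`, `optimal_values`, and `hull_cluster_counts` are hypothetical.

```python
from typing import Callable, List


def make_cost(x: List[float]) -> Callable[[int, int], float]:
    """Return cost(i, j): sum of squared deviations from the mean for x[i:j]."""
    prefix, prefix_sq = [0.0], [0.0]
    for v in x:
        prefix.append(prefix[-1] + v)
        prefix_sq.append(prefix_sq[-1] + v * v)

    def cost(i: int, j: int) -> float:
        s = prefix[j] - prefix[i]
        return (prefix_sq[j] - prefix_sq[i]) - s * s / (j - i)

    return cost


def optimal_values(x: List[float]) -> List[float]:
    """f[g-1] = best criterion value using exactly g contiguous clusters."""
    n = len(x)
    cost = make_cost(x)
    inf = float("inf")
    # best[g][j]: optimal value for the first j objects split into g clusters
    best = [[inf] * (n + 1) for _ in range(n + 1)]
    best[0][0] = 0.0
    for g in range(1, n + 1):
        for j in range(g, n + 1):
            best[g][j] = min(best[g - 1][i] + cost(i, j) for i in range(g - 1, j))
    return [best[g][n] for g in range(1, n + 1)]


def hull_cluster_counts(f: List[float], tol: float = 1e-9) -> List[int]:
    """Values of g whose point (g, f[g-1]) is a lower-convex-hull vertex.

    Assumption: hull vertices stand in for 'attractive rate of improvement'.
    """
    hull = []
    for g, val in enumerate(f, start=1):
        while len(hull) >= 2:
            (g1, v1), (g2, v2) = hull[-2], hull[-1]
            # Drop the middle point if it lies on or above the chord
            cross = (g2 - g1) * (val - v1) - (v2 - v1) * (g - g1)
            if cross <= tol:
                hull.pop()
            else:
                break
        hull.append((g, val))
    return [g for g, _ in hull]


if __name__ == "__main__":
    data = [1.0, 1.1, 5.0, 5.1, 9.0, 9.1]  # made-up ordered observations
    f = optimal_values(data)
    for g, val in enumerate(f, start=1):
        print(g, round(val, 4))
    print("hull vertices:", hull_cluster_counts(f))
```

Under this reading, values of g that are not hull vertices give an optimal partition whose improvement over the neighbouring values is comparatively poor, which is the situation the abstract calls w-inefficient. The dynamic program takes O(n^3) time as written and is meant only to make the idea concrete on small inputs.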