Clustering Sentence-Level Text Using a Fuzzy Back-Propagation Clustering Algorithm

In contrast to hard clustering methods, in which each pattern belongs to exactly one cluster, fuzzy clustering algorithms allow a pattern to belong to all clusters with differing degrees of membership. This is important in domains such as sentence clustering, since a sentence may relate to more than one topic present within a document or set of documents. Because most sentence similarity measures do not represent sentences in a common metric space, traditional fuzzy clustering approaches are generally not applicable to sentence clustering. This paper presents a Back Propagation fuzzy clustering algorithm. The algorithm uses a graph representation of the data and operates in a Back Propagation framework in which the graph centrality of an object is interpreted as a likelihood. Results of applying the algorithm to sentence clustering tasks demonstrate that it is capable of identifying more clusters of related sentences, and that it is therefore of potential use in a variety of text mining tasks.

Keywords: Sentence Clustering, Fuzzy Clusters, Back Propagation, PageRank, Membership Values
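
The idea of treating graph centrality as a likelihood can be illustrated with a small sketch. The code below is not the algorithm presented in this paper; it is a simplified illustration that assumes a symmetric sentence-similarity matrix, uses PageRank as the centrality measure, and renormalizes the per-cluster scores into membership values. The function names, the damping factor of 0.85, and the fixed number of rounds are all assumptions made for the example.

```python
def pagerank(weights, damping=0.85, iters=100):
    """Power-iteration PageRank on a non-negative weight matrix (list of lists)."""
    n = len(weights)
    # Total outgoing weight of each node j (column sums).
    col_sums = [sum(weights[i][j] for i in range(n)) for j in range(n)]
    scores = [1.0 / n] * n
    for _ in range(iters):
        new = []
        for i in range(n):
            # Score flowing into node i from every node j with outgoing weight.
            inflow = sum(
                weights[i][j] * scores[j] / col_sums[j]
                for j in range(n)
                if col_sums[j] > 0
            )
            new.append((1 - damping) / n + damping * inflow)
        total = sum(new)
        scores = [s / total for s in new]  # keep scores summing to 1
    return scores


def fuzzy_memberships(similarity, memberships, rounds=10):
    """Alternate between per-cluster PageRank and membership renormalization.

    similarity:  n x n symmetric matrix of sentence similarities (assumed input).
    memberships: n x k initial membership values, each row summing to 1.
    """
    n = len(similarity)
    k = len(memberships[0])
    for _ in range(rounds):
        likelihood = []
        for c in range(k):
            # Scale each edge by how strongly both endpoints belong to cluster c.
            weighted = [
                [similarity[i][j] * memberships[i][c] * memberships[j][c]
                 for j in range(n)]
                for i in range(n)
            ]
            # Centrality within cluster c is read as a likelihood.
            likelihood.append(pagerank(weighted))
        # Normalize across clusters so each sentence's memberships sum to 1.
        memberships = [
            [likelihood[c][i] / sum(likelihood[d][i] for d in range(k))
             for c in range(k)]
            for i in range(n)
        ]
    return memberships
```

Because only pairwise similarities are required, a sketch of this kind never needs sentences to be embedded in a common metric space, and every sentence retains a nonzero membership in every cluster.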