Constructing Distributed Representations Using Additive Clustering

If the promise of computational modeling is to be fully realized in higher-level cognitive domains such as language processing, principled methods must be developed to construct the semantic representations used in such models. In this paper, we propose the use of an established formalism from mathematical psychology, additive clustering, as a means of automatically constructing binary representations for objects using only pair-wise similarity data. However, existing methods for the unsupervised learning of additive clustering models do not scale well to large problems. We present a new algorithm for additive clustering, based on a novel heuristic technique for combinatorial optimization. The algorithm is simpler than previous formulations and makes fewer independence assumptions. Extensive empirical tests on both human and synthetic data suggest that it is more effective than previous methods and that it also scales better to larger problems. By making additive clustering practical, we take a significant step toward scaling connectionist models beyond hand-coded examples.
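
As a point of orientation, the additive clustering formalism referred to above models the similarity between two objects as the sum of the weights of the clusters (binary features) they share, plus an additive constant. The sketch below illustrates only that model and a squared-error fit measure. It is not the algorithm proposed in this paper, and the names `adclus_similarity`, `squared_error`, `features`, `weights`, and `constant` are hypothetical choices made for illustration. Leaving the diagonal at zero is likewise an assumption, since self-similarities are typically not modeled.

```python
def adclus_similarity(features, weights, constant=0.0):
    """Predicted similarities under the additive clustering model.

    features: n rows of m binary values; features[i][k] == 1 means
              object i belongs to cluster k (hypothetical layout).
    weights:  m cluster weights.
    constant: additive constant shared by every pair.
    """
    n = len(features)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:  # assumption: self-similarity is left at 0.0
                shared = sum(w * fi * fj for w, fi, fj
                             in zip(weights, features[i], features[j]))
                sim[i][j] = constant + shared
    return sim


def squared_error(observed, predicted):
    """Sum of squared differences over distinct pairs i < j."""
    n = len(observed)
    return sum((observed[i][j] - predicted[i][j]) ** 2
               for i in range(n) for j in range(i + 1, n))


# Three objects, two clusters: object 1 belongs to both clusters.
features = [[1, 0], [1, 1], [0, 1]]
weights = [0.5, 0.3]
predicted = adclus_similarity(features, weights)
```

Learning an additive clustering model means searching for the binary `features` matrix and the `weights` that minimize such an error against observed pairwise similarities. The search over binary matrices is the combinatorial optimization problem the abstract refers to.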
