Cautionary Remarks on the Use of Clusterwise Regression

Clusterwise linear regression is a multivariate statistical procedure that partitions objects into clusters so as to minimize the sum, across clusters, of the error sums of squares of the within-cluster regression models. In this article, we show that minimizing this criterion makes no effort to distinguish the variation explained by the within-cluster regression models from the variation explained by the clustering process itself. In some cases, most of the variation in the response variable is explained by clustering the objects, with little additional benefit provided by the within-cluster regression models. Accordingly, there is tremendous potential for overfitting with clusterwise regression, which we demonstrate with numerical examples and simulation experiments. To guard against the misuse of clusterwise regression, we recommend a benchmarking procedure that compares the results for the observed empirical data with those obtained across a set of random permutations of the response measures. We also demonstrate the potential for overfitting in an empirical application: the prediction of reflective judgment from high school and college performance measures.
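The criterion and the permutation benchmark described above can be illustrated with a small sketch. It is not the authors' implementation: it assumes a simple alternating heuristic (refit ordinary least squares within each cluster, then reassign each object to the cluster whose model fits it best) in place of whatever algorithm the article uses, and the function names, the number of random restarts, and the simulated data are all hypothetical choices made for illustration. The response in the demonstration is pure noise, so any variation "explained" comes from the clustering itself.

```python
import numpy as np


def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns the coefficient vector."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta


def clusterwise_sse(X, y, labels, k):
    """Sum over clusters of the within-cluster residual sums of squares."""
    A = np.column_stack([np.ones(len(y)), X])
    total = 0.0
    for c in range(k):
        members = labels == c
        if members.any():
            resid = y[members] - A[members] @ fit_ols(X[members], y[members])
            total += float(resid @ resid)
    return total


def clusterwise_regression(X, y, k, n_starts=20, max_iter=100, seed=0):
    """Alternating heuristic (an assumption, not the article's algorithm).

    Returns (best_sse, best_labels) over n_starts random initial partitions.
    """
    rng = np.random.default_rng(seed)
    A = np.column_stack([np.ones(len(y)), X])
    best_sse, best_labels = np.inf, None
    for _ in range(n_starts):
        labels = rng.integers(0, k, size=len(y))
        for _ in range(max_iter):
            # Squared residual of every object under every cluster's model;
            # empty clusters keep infinite residuals and attract no objects.
            sq_resid = np.full((len(y), k), np.inf)
            for c in range(k):
                members = labels == c
                if members.any():
                    sq_resid[:, c] = (y - A @ fit_ols(X[members], y[members])) ** 2
            new_labels = sq_resid.argmin(axis=1)
            if np.array_equal(new_labels, labels):
                break
            labels = new_labels
        sse = clusterwise_sse(X, y, labels, k)
        if sse < best_sse:
            best_sse, best_labels = sse, labels
    return best_sse, best_labels


def permutation_benchmark(X, y, k, n_perm=20, seed=0):
    """Proportion of variation explained for the observed response and for
    random permutations of the response (which break any link to X)."""
    rng = np.random.default_rng(seed)
    tss = float(np.sum((y - y.mean()) ** 2))  # unchanged by permuting y
    observed = 1.0 - clusterwise_regression(X, y, k, seed=seed)[0] / tss
    permuted = np.array([
        1.0 - clusterwise_regression(X, rng.permutation(y), k, seed=seed)[0] / tss
        for _ in range(n_perm)
    ])
    return observed, permuted


# Hypothetical demonstration: the response is unrelated to the predictors.
demo_rng = np.random.default_rng(1)
X_demo = demo_rng.normal(size=(60, 2))
y_demo = demo_rng.normal(size=60)
observed, permuted = permutation_benchmark(X_demo, y_demo, k=3)
print(f"observed: {observed:.3f}  permuted mean: {permuted.mean():.3f}")
```

If the observed proportion of variation explained is no larger than the values obtained for the permuted responses, the apparent fit is attributable to the clustering process rather than to the within-cluster regression models.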
