A mathematical programming approach to clusterwise regression model and its extensions

Abstract: The clusterwise regression model performs cluster analysis within a regression framework. Whereas the traditional regression model assumes that the regression coefficient (β) is identical for all subjects in the sample, the clusterwise regression model allows β to differ across clusters of subjects. Because cluster membership is unknown, estimating a clusterwise regression is a hard combinatorial optimization problem. In this research, we propose a “Generalized Clusterwise Regression Model”, formulated as a mathematical programming (MP) problem. A nonlinear programming procedure with linear constraints is proposed to solve the combinatorial problem and to estimate the cluster membership and β simultaneously. Moreover, by integrating cluster analysis with discriminant analysis, we develop a clusterwise discriminant model that incorporates parameter heterogeneity into traditional discriminant analysis. Its cluster membership and discriminant parameters are likewise estimated simultaneously, by a second nonlinear programming model.
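
To make the estimation problem concrete, the following is a minimal sketch of clusterwise regression with a single predictor. It is not the nonlinear programming procedure proposed in the paper; it swaps in a simple alternating heuristic (refit each cluster's line, then reassign each subject to the line with the smallest squared residual). The function names `fit_line` and `clusterwise_regression`, the toy data, and the starting labels are all assumptions made for illustration.

```python
# Illustrative sketch only: an alternating refit/reassign heuristic,
# NOT the nonlinear programming procedure described in the abstract.
# All names and data below are hypothetical.

def fit_line(points):
    """Ordinary least squares for y = a + b*x on a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        # All x values coincide: fall back to a horizontal line.
        return mean_y, 0.0
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def clusterwise_regression(points, start_labels, n_clusters, max_iter=50):
    """Alternate between per-cluster fits and residual-based reassignment."""
    labels = list(start_labels)
    coefs = [(0.0, 0.0)] * n_clusters
    for _ in range(max_iter):
        # Step 1: estimate (intercept, slope) separately for each cluster.
        for k in range(n_clusters):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the previous coefficients for an empty cluster
                coefs[k] = fit_line(members)
        # Step 2: move each subject to the cluster whose line fits it best.
        new_labels = [
            min(range(n_clusters),
                key=lambda k: (y - coefs[k][0] - coefs[k][1] * x) ** 2)
            for x, y in points
        ]
        if new_labels == labels:
            break  # memberships are stable
        labels = new_labels
    return labels, coefs


# Toy data: five subjects on y = 2x and five on y = 20 - x.
points = [(1, 2), (2, 4), (3, 6), (4, 8), (5, 10),
          (1, 19), (2, 18), (3, 17), (4, 16), (5, 15)]
# Starting memberships with one subject of each group mislabeled.
start = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]

labels, coefs = clusterwise_regression(points, start, n_clusters=2)
print(labels)
print(coefs)
```

A heuristic of this kind can stall at a poor local solution when the starting memberships are unlucky, which illustrates why the abstract treats joint estimation of membership and β as a hard combinatorial problem.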
