Segmentation problems

We study a novel genre of optimization problems, which we call segmentation problems, motivated in part by certain aspects of clustering and data mining. For any classical optimization problem, the corresponding segmentation problem seeks to partition a set of cost vectors into several segments, so that the overall cost is optimized. We focus on two natural and interesting (but MAXSNP-complete) problems in this class, the hypercube segmentation problem and the catalog segmentation problem, and present approximation algorithms for them. We also present a general greedy scheme, which can be specialized to approximate any segmentation problem.
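As a rough illustration of the objective described above, the sketch below assumes the common formulation in which each customer is a cost vector, each candidate decision is a vector of the same length, and a customer's value under a set of k decisions is the best dot product it achieves with any of them. The function names, the finite `candidates` list, and the toy data are all hypothetical, and the greedy loop is a generic marginal-gain heuristic rather than the specific scheme analyzed in the paper.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def dot(u: Vector, v: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def segmentation_value(customers: List[Vector], decisions: List[Vector]) -> float:
    """Total value when each customer is served by its best decision."""
    return sum(max(dot(c, x) for x in decisions) for c in customers)


def greedy_segmentation(
    customers: List[Vector], candidates: List[Vector], k: int
) -> Tuple[List[Vector], float]:
    """Pick k decisions one at a time, each maximizing the total value so far.

    Assumption: decisions are drawn from a finite, explicitly listed
    candidate set; this is an illustrative simplification.
    """
    chosen: List[Vector] = []
    # best[i] is the value customer i gets from the decisions chosen so far.
    best = [float("-inf")] * len(customers)
    for _ in range(k):
        def total_with(x: Vector) -> float:
            return sum(max(b, dot(c, x)) for b, c in zip(best, customers))

        x_star = max(candidates, key=total_with)
        chosen.append(x_star)
        best = [max(b, dot(c, x_star)) for b, c in zip(best, customers)]
    return chosen, sum(best)


# Toy catalog-style instance: 3 items, each catalog holds exactly 2 items.
customers = [(1, 1, 0), (1, 1, 0), (0, 0, 1), (0, 1, 1)]
candidates = [(1, 1, 0), (1, 0, 1), (0, 1, 1)]
chosen, value = greedy_segmentation(customers, candidates, k=2)
```

Here each customer vector marks the items that customer likes, each candidate marks the items in one catalog, and the dot product counts liked items that appear in the catalog. The segments are implicit: every customer belongs to the segment of the chosen decision that serves it best.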
