Unsupervised and supervised data classification via nonsmooth and global optimization

We examine various methods for data clustering and data classification that are based on the minimization of the so-called cluster function and its modifications. These functions are nonsmooth and nonconvex. We use the discrete gradient method for their local minimization. We also consider a combination of this method with the cutting angle method for global minimization. We present and discuss results of numerical experiments.
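
To make the objective concrete, the following is a minimal sketch of a cluster function, assuming the common form in which each data point contributes its squared Euclidean distance to the nearest cluster center and the contributions are averaged. The abstract does not state the exact formula, so this form, the squared Euclidean norm, and the names `cluster_function`, `centers` and `data` are illustrative assumptions. The inner minimum over centers is what makes the function nonsmooth, and, as a function of all centers jointly, it is in general nonconvex.

```python
from typing import Sequence

Point = Sequence[float]


def squared_distance(x: Point, a: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x_k - a_k) ** 2 for x_k, a_k in zip(x, a))


def cluster_function(centers: Sequence[Point], data: Sequence[Point]) -> float:
    """Assumed cluster function: the mean, over all data points, of the
    squared distance from the point to its nearest center.

    The min over centers introduces kinks wherever a data point is
    equidistant from two centers, so the function is not differentiable
    everywhere.
    """
    if not centers or not data:
        raise ValueError("centers and data must both be non-empty")
    total = 0.0
    for a in data:
        # Nearest-center term for this data point.
        total += min(squared_distance(x, a) for x in centers)
    return total / len(data)


if __name__ == "__main__":
    # Hypothetical toy data: two well-separated groups in the plane.
    data = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
    print(cluster_function([(0.0, 1.0), (10.0, 1.0)], data))
    print(cluster_function([(5.0, 1.0)], data))
```

On this toy data, placing one center in each group gives a much smaller value than a single center placed between the groups, which is the behavior a clustering method minimizing such a function exploits.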
