A global optimisation approach to classification in medical diagnosis and prognosis

Techniques based on global optimisation are studied with the aim of increasing the accuracy of medical diagnosis and prognosis using fine needle aspirate (FNA) image data from the Wisconsin Diagnostic and Prognostic Breast Cancer databases. First, we discuss the problem of determining the most informative features for classifying cancerous cases in these databases. Second, we apply a technique based on convex and global optimisation to breast cancer diagnosis; it classifies cases as benign or malignant and supports the subsequent diagnosis of patients with very high accuracy. Third, we apply the same technique to prognosis, using a method that calculates the centres of clusters to predict when breast cancer is likely to recur in patients from whom the cancer has been removed. The technique achieves higher accuracy on these databases than has been reported elsewhere in the literature.
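To make the idea of classifying by cluster centres concrete, the following is a minimal sketch and not the method of the paper. It assumes that each class is summarised by one or more centres and that a new case takes the label of the nearest centre. The coordinate-wise mean stands in for the centres that the paper obtains by convex and global optimisation, and the two-feature toy data and all function names are hypothetical.

```python
import math


def dist(p, q):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def cluster_objective(points, centres):
    """Sum, over all points, of the distance to the nearest centre.

    A centre-finding method would try to make this value small;
    here it is only evaluated, not optimised.
    """
    return sum(min(dist(p, c) for c in centres) for p in points)


def class_centre(points):
    """Coordinate-wise mean: a simple stand-in for an optimised centre."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def classify(x, centres_by_label):
    """Return the label whose closest centre is nearest to x."""
    return min(
        centres_by_label,
        key=lambda label: min(dist(x, c) for c in centres_by_label[label]),
    )


# Hypothetical toy data with two features per case (not from the databases).
benign = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1)]
malignant = [(3.0, 3.2), (3.1, 2.9), (2.8, 3.0)]

centres = {
    "benign": [class_centre(benign)],
    "malignant": [class_centre(malignant)],
}

print(classify((1.1, 1.0), centres))
print(cluster_objective(benign + malignant,
                        centres["benign"] + centres["malignant"]))
```

Allowing several centres per class, rather than one, lets the same nearest-centre rule follow classes that are not compact around a single point.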
