Data Mining via Generalized Support Vector Machines
Abstract: Generalized support vector machines were used to extract valuable information from datasets and to construct fast classification algorithms for massive data. The influence of chemotherapy on breast cancer patients was investigated by obtaining well-separated Kaplan-Meier survival curves for three classes of patients. A novel approach was proposed for generating an accurate classifier from a minimal number of data points. Substantial progress was also made towards new results in data mining by using the versatile and effective approach of support vector machines. In particular, minimal kernel classifiers were constructed that use a minimal subset of the data. A new type of classifier, the proximal classifier, was proposed and implemented; it is roughly an order of magnitude faster than conventional classifiers. The effect of chemotherapy on breast cancer patients was more accurately assessed. An incremental classification algorithm was proposed and implemented; it was capable of classifying a billion points in less than three hours on a 400 MHz machine. New techniques for incorporating prior expert knowledge, such as medical doctors' experience, into classifiers were devised and computationally implemented. Very fast Newton methods were proposed and successfully tried on extremely large classification problems and linear programming problems.
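The abstract does not spell out how the proximal classifier is built, so the following is only a minimal sketch of one common linear formulation, not necessarily the one used in this work. It assumes the classifier is found by solving a single small linear system, (I/nu + E'E) [w; gamma] = E'd with E = [A, -e], where A holds one data point per row, d holds the labels +1 or -1, and e is a vector of ones. The function names and the parameter `nu` are hypothetical, chosen for illustration.

```python
import numpy as np


def train_proximal_svm(A, d, nu=1.0):
    """Fit a linear proximal classifier (illustrative sketch, assumed formulation).

    A  : (m, n) array, one data point per row.
    d  : (m,) array of labels, each +1 or -1.
    nu : hypothetical positive weight on the fitting error.

    Returns (w, gamma) defining the separating plane x @ w = gamma.
    """
    A = np.asarray(A, dtype=float)
    d = np.asarray(d, dtype=float)
    m, n = A.shape
    # E = [A, -e], where e is a column of ones.
    E = np.hstack([A, -np.ones((m, 1))])
    # Normal equations of the regularized least-squares problem:
    # (I / nu + E'E) z = E'd, with z = [w; gamma].
    lhs = np.eye(n + 1) / nu + E.T @ E
    rhs = E.T @ d
    z = np.linalg.solve(lhs, rhs)
    return z[:n], z[n]


def predict(A, w, gamma):
    """Assign +1 or -1 according to the side of the plane x @ w = gamma."""
    A = np.asarray(A, dtype=float)
    return np.where(A @ w - gamma >= 0, 1, -1)
```

Under this assumed formulation the system to solve has only n + 1 unknowns, where n is the number of features, so its size does not grow with the number of data points.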