Feature Selection and Sample Classification for SELDI-TOF Mass Spectrometry Data Based on Affinity Propagation Clustering

To analyze high-throughput, high-resolution mass spectrometry data effectively and to capture cancer-related protein features from them, this paper proposes a diagnostic method built on feature selection for mass spectrometry data using affinity propagation clustering. First, a t-test is applied to the mass spectrometry data, followed by feature selection based on affinity propagation clustering. Second, affinity propagation clustering and NS-LDA are used to reduce the dimensionality of the data and the correlation among features. Third, SVM-RFE is used to select the final features. Finally, four classifiers are used to evaluate the performance of the algorithm. The proposed method was tested and evaluated on the ovarian cancer datasets OC-WCX2a and OC-WCX2b and on the breast cancer dataset BC-WCX2a. On these three datasets, classification accuracy reached 96.43%, 99.66%, and 90.88%, sensitivity reached 97.00%, 100%, and 96.17%, and specificity reached 95.85%, 99.08%, and 81.92%, respectively, with 10 m/z features selected for each dataset. The experimental results show that the method performs well, and it is expected to be useful in cancer diagnosis.
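As a rough illustration of the first two stages of the pipeline described above (a t-test followed by affinity propagation clustering of the surviving features), the following is a minimal sketch and not the authors' implementation. It uses only the Python standard library. Several choices are assumptions that the abstract does not specify: Welch's form of the t-test, negative squared Euclidean distance as the similarity between feature columns, the median similarity as the preference, the damping factor and iteration count, and the toy data. The function names are hypothetical. NS-LDA, SVM-RFE, and the four classifiers are omitted.

```python
import math
import statistics


def welch_t(a, b):
    """Welch's two-sample t-statistic (assumed variant; the abstract only says 't-test')."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se


def t_test_filter(X, y, k):
    """Indices of the k features with the largest |t| between class 0 and class 1."""
    scored = []
    for j in range(len(X[0])):
        a = [row[j] for row, label in zip(X, y) if label == 0]
        b = [row[j] for row, label in zip(X, y) if label == 1]
        scored.append((abs(welch_t(a, b)), j))
    return sorted(j for _, j in sorted(scored, reverse=True)[:k])


def feature_similarity(X, features):
    """Negative squared Euclidean distance between feature columns (assumed measure)."""
    cols = [[row[j] for row in X] for j in features]
    n = len(cols)
    S = [[-sum((u - v) ** 2 for u, v in zip(cols[i], cols[k])) for k in range(n)]
         for i in range(n)]
    # Preference (diagonal) = median off-diagonal similarity, a common default.
    pref = statistics.median(S[i][k] for i in range(n) for k in range(n) if i != k)
    for i in range(n):
        S[i][i] = pref
    return S


def affinity_propagation(S, damping=0.5, iterations=200):
    """Plain affinity propagation; returns the positions chosen as exemplars."""
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)
    for _ in range(iterations):
        # r(i,k) = s(i,k) - max over k' != k of (a(i,k') + s(i,k'))
        for i in range(n):
            for k in range(n):
                rival = max(A[i][kk] + S[i][kk] for kk in range(n) if kk != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - rival)
        # a(i,k) = min(0, r(k,k) + sum of positive r(i',k) for i' not in {i, k});
        # a(k,k) = sum of positive r(i',k) for i' != k
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]
            for i in range(n):
                new = total if i == k else min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    return [k for k in range(n) if R[k][k] + A[k][k] > 0]


# Toy data (made up): 6 spectra x 7 m/z features; labels 0 = control, 1 = cancer.
X = [
    [1.0, 1.2, 1.4, 5.0, 5.2, 5.4, 3.0],
    [1.1, 1.3, 1.5, 5.1, 5.3, 5.5, 3.2],
    [0.9, 1.1, 1.3, 4.9, 5.1, 5.3, 2.8],
    [2.0, 2.2, 2.4, 3.0, 3.2, 3.4, 3.1],
    [2.1, 2.3, 2.5, 3.1, 3.3, 3.5, 2.9],
    [1.9, 2.1, 2.3, 2.9, 3.1, 3.3, 3.0],
]
y = [0, 0, 0, 1, 1, 1]

kept = t_test_filter(X, y, k=6)           # stage 1: drop the least discriminative feature
S = feature_similarity(X, kept)           # pairwise similarity of the surviving features
exemplars = [kept[k] for k in affinity_propagation(S)]  # stage 2: cluster representatives
print("kept:", kept, "exemplars:", exemplars)
```

In this sketch the exemplars returned by affinity propagation serve as representatives of groups of mutually similar features, which is one plausible way the clustering step could reduce correlation before a later selection step such as SVM-RFE. On real spectra a correlation-based similarity and a tuned preference may be more appropriate.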