Feature selection for computer-aided detection: comparing different selection criteria

In this study we investigated different feature selection methods for use in computer-aided mass detection. The data set we used (1357 malignant mass regions and 58444 normal regions) was much larger than those used in previous research, in which feature selection did not directly improve performance over using the entire feature set. We introduced a new performance measure to be used during feature selection, defined as the mean sensitivity in an interval of the free response operating characteristic (FROC) curve computed on a logarithmic scale. Because this measure is similar to the final validation performance measure we were optimizing, it was expected to give better results than more general feature selection criteria. We compared the performance of feature sets selected using the mean sensitivity of the FROC curve with that of sets selected using the Wilks' lambda statistic, and we investigated the effect of reducing the skewness in the distribution of the feature values before performing feature selection. In the case of Wilks' lambda, we found that reducing skewness had a clear positive effect, yielding performance similar to or exceeding that obtained when the entire feature set was used. Contrary to this expectation, our results indicate that a general measure like Wilks' lambda selects better performing feature sets than the mean sensitivity of the FROC curve.
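
The selection criterion described above, the mean sensitivity over an interval of the FROC curve on a logarithmic scale, can be sketched as follows. This is only an illustrative sketch and not the authors' implementation: the function name `mean_froc_sensitivity`, the default interval of 0.05 to 1 false positives per image, the number of sample points, and the region-level definition of sensitivity are all assumptions that the abstract does not specify.

```python
import numpy as np


def mean_froc_sensitivity(pos_scores, neg_scores, n_images,
                          fp_lo=0.05, fp_hi=1.0, n_points=101):
    """Mean sensitivity of an FROC curve over [fp_lo, fp_hi] false positives
    per image, averaged on a logarithmic false-positive axis.

    Hypothetical helper: interval bounds and sampling density are assumed.
    pos_scores: classifier outputs for malignant regions.
    neg_scores: classifier outputs for normal regions.
    """
    pos = np.asarray(pos_scores, dtype=float)
    # Sort normal-region scores from highest to lowest.
    neg = np.sort(np.asarray(neg_scores, dtype=float))[::-1]

    # Operating points spaced uniformly in log(false positives per image).
    fp_rates = np.logspace(np.log10(fp_lo), np.log10(fp_hi), n_points)

    sens = np.empty(n_points)
    for i, fp in enumerate(fp_rates):
        # Number of false positives allowed over the whole image set.
        k = int(np.floor(fp * n_images))
        # Threshold at the (k+1)-th highest normal score, so that at most
        # k normal regions score strictly above it.
        thr = neg[k] if k < neg.size else -np.inf
        # Fraction of malignant regions detected at this threshold.
        sens[i] = np.mean(pos > thr)

    # Trapezoidal average over the uniformly spaced log-scale samples.
    return float((0.5 * (sens[:-1] + sens[1:])).mean())
```

In a wrapper-style selection loop, a value like this would be computed for each candidate feature subset from the classifier outputs on training data, and the subset with the highest value would be kept. Averaging on a logarithmic axis gives the low false-positive operating points more weight than a linear average would.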
