Similarity measurement of lung masses for medical image retrieval using kernel based semisupervised distance metric.

PURPOSE: To develop a new algorithm that measures the similarity between a query lung mass and a reference data set of lung masses for content-based medical image retrieval (CBMIR).

METHODS: A lung mass data set of 746 mass regions of interest (ROIs) was assembled; 375 ROIs depicted malignant lesions and 371 depicted benign lesions. Each mass ROI was represented by a vector of 26 texture features. A kernel function was used to map the original data from the input space to a feature space. In that space, a semisupervised distance metric was learned, in which a differential scatter discriminant criterion represents semantic relevance and a regularization term represents visual similarity. The learned distance metric measures the similarity between the query mass and the masses in the reference data set. Clustering accuracy was used to configure the parameters, and retrieval accuracy and classification accuracy were used as the performance assessment indices.

RESULTS: After the parameters were configured, a mean clustering accuracy of 0.87 was achieved. For retrieval accuracy, the proposed algorithm performed better than other state-of-the-art retrieval algorithms under leave-one-out validation on the testing data set. For classification accuracy, the area under the ROC curve of the proposed algorithm was 0.941 ± 0.006. The running times of 346 query images with the proposed algorithm are 5.399 and 6.0 s, respectively.

CONCLUSIONS: The results demonstrate that the proposed algorithm outperforms the compared algorithms when both semantic relevance and visual similarity are taken into account in kernel space. The algorithm can be used in a CBMIR system to retrieve masses similar to a query mass, which can help doctors make better decisions.
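The pipeline described in METHODS can be illustrated with a minimal sketch. It is not the authors' implementation: the RBF kernel, the k-nearest-neighbor graph Laplacian used as the visual-similarity regularizer, the eigenvector-based solution, and every function name and parameter (`gamma`, `rho`, `lam`, `k`, `dim`) are assumptions made for illustration. The sketch maximizes a differential scatter criterion (between-class scatter minus `rho` times within-class scatter on the labeled ROIs) minus `lam` times a smoothness penalty over all ROIs, then ranks reference masses by Euclidean distance in the learned space.

```python
import numpy as np


def sq_dists(X, Y):
    """Pairwise squared Euclidean distances between rows of X and rows of Y."""
    d = (X ** 2).sum(1)[:, None] + (Y ** 2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.maximum(d, 0.0)  # clip tiny negatives caused by round-off


def rbf_kernel(X, Y, gamma):
    """Gaussian (RBF) kernel; the kernel choice is an assumption."""
    return np.exp(-gamma * sq_dists(X, Y))


def knn_laplacian(X, k):
    """Unnormalized Laplacian of a symmetric k-nearest-neighbor graph."""
    n = X.shape[0]
    order = np.argsort(sq_dists(X, X), axis=1)[:, 1:k + 1]  # skip self
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), order.ravel()] = 1.0
    W = np.maximum(W, W.T)  # make the graph undirected
    return np.diag(W.sum(axis=1)) - W


def fit_projection(X, y, labeled, gamma, rho=1.0, lam=0.1, k=5, dim=2):
    """Learn expansion coefficients A (n x dim) in kernel space.

    Hypothetical objective: tr(A^T (Sb - rho*Sw - lam*K L K) A), A^T A = I.
    Only the labeled columns of K contribute to the scatter matrices.
    """
    K = rbf_kernel(X, X, gamma)
    Kl, yl = K[:, labeled], y[labeled]
    n = K.shape[0]
    mean_all = Kl.mean(axis=1, keepdims=True)
    Sb, Sw = np.zeros((n, n)), np.zeros((n, n))
    for c in np.unique(yl):
        Kc = Kl[:, yl == c]
        mc = Kc.mean(axis=1, keepdims=True)
        Sb += Kc.shape[1] * (mc - mean_all) @ (mc - mean_all).T
        Sw += (Kc - mc) @ (Kc - mc).T
    M = Sb - rho * Sw - lam * K @ knn_laplacian(X, k) @ K
    _, vecs = np.linalg.eigh((M + M.T) / 2.0)  # eigenvalues ascending
    return vecs[:, -dim:]  # eigenvectors of the largest eigenvalues


def embed(Xq, X_ref, A, gamma):
    """Map rows of Xq into the learned space via kernel values to X_ref."""
    return rbf_kernel(Xq, X_ref, gamma) @ A


def retrieve(xq, X_ref, A, gamma, top=5):
    """Return indices and distances of the `top` closest reference ROIs."""
    Z = embed(X_ref, X_ref, A, gamma)
    zq = embed(xq[None, :], X_ref, A, gamma)
    d = np.sqrt(((Z - zq) ** 2).sum(axis=1))
    order = np.argsort(d)[:top]
    return order, d[order]


if __name__ == "__main__":
    # Synthetic stand-in for 26-dimensional texture vectors (not real data).
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0.0, 1.0, (20, 26)),
                   rng.normal(1.5, 1.0, (20, 26))])
    y = np.repeat([0, 1], 20)
    labeled = np.zeros(40, dtype=bool)
    labeled[::2] = True  # pretend only half of the ROIs carry labels
    A = fit_projection(X, y, labeled, gamma=1.0 / 26)
    print(retrieve(X[3], X, A, gamma=1.0 / 26))
```

In this sketch the labeled ROIs supply the semantic term and all ROIs, labeled or not, supply the smoothness term, which is what makes it semisupervised. Values such as `rho` and `lam` would need to be tuned, for example against clustering accuracy as the abstract describes.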
