Medical Case Retrieval From a Committee of Decision Trees

This paper presents a novel content-based information retrieval framework designed to cover several medical applications. The framework retrieves medical cases, possibly incomplete, that consist of several images together with semantic information. It relies on a committee of decision trees, which are decision support tools well suited to processing this type of information. Within the framework, images are characterized by their digital content. The framework was applied to two heterogeneous medical datasets for computer-aided diagnosis: a diabetic retinopathy follow-up dataset (DRD) and a mammography-screening dataset (DDSM). A precision among the top five retrieved results of 0.788 ± 0.137 and 0.869 ± 0.161 was obtained on DRD and DDSM, respectively. On DRD, for instance, this is an increase of one half over the retrieval of single images.
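The precision among the top five retrieved results can be illustrated with a minimal sketch. It assumes that a retrieved case counts as relevant when its diagnosis label matches that of the query; this relevance criterion, the function name precision_at_k, and the example labels are hypothetical and not taken from the paper.

```python
from typing import Sequence


def precision_at_k(retrieved_labels: Sequence[str], query_label: str, k: int = 5) -> float:
    """Fraction of the top-k retrieved cases whose label matches the query's.

    Assumption: relevance means "same diagnosis label as the query".
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = list(retrieved_labels)[:k]
    relevant = sum(1 for label in top_k if label == query_label)
    # Divide by k (not by len(top_k)) so that short result lists are penalized.
    return relevant / k


# Hypothetical ranked labels for one query case labeled "moderate".
ranked = ["moderate", "moderate", "mild", "moderate", "moderate", "severe"]
print(precision_at_k(ranked, "moderate"))  # 4 of the top 5 match -> 0.8
```

Averaging this score over all query cases gives a dataset-level figure of the kind reported above.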