Case-based retrieval of similar diabetic patients

Patients with diabetes often develop comorbidities such as hypertension and dyslipidemia. These comorbidities lead to more complex patient profiles, each associated with specific treatments. In this paper we present a novel algorithm that helps physicians retrieve similar past patient cases when presented with a new case. The algorithm is based on the bag-of-words (BoW) model: following the approach used in document classification, it encodes as features the occurrences of each pre-computed cluster for each patient. We evaluate the algorithm on a real, de-identified dataset of 3201 diabetic patients and demonstrate the advantage of our approach.
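The BoW encoding and retrieval step can be illustrated with a minimal sketch. It assumes that each patient's observations have already been assigned to pre-computed clusters (represented here as integer cluster ids) and that similarity is measured with cosine similarity; the abstract does not state the similarity measure, so that choice, along with the function names and the toy data, is hypothetical.

```python
import math
from collections import Counter


def bow_vector(cluster_ids, n_clusters):
    """Count how often each pre-computed cluster occurs for one patient."""
    counts = Counter(cluster_ids)
    return [counts.get(c, 0) for c in range(n_clusters)]


def cosine_similarity(u, v):
    """Cosine similarity between two count vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def retrieve_similar(query_ids, past_cases, n_clusters, k=3):
    """Return the k past cases whose BoW vectors are closest to the query."""
    query_vec = bow_vector(query_ids, n_clusters)
    scored = [
        (patient_id, cosine_similarity(query_vec, bow_vector(ids, n_clusters)))
        for patient_id, ids in past_cases.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Toy data: each past patient is a list of cluster ids (hypothetical values).
past_cases = {
    "patient_a": [0, 0, 1, 2],
    "patient_b": [3, 3, 3, 1],
    "patient_c": [0, 1, 2, 2],
}
top = retrieve_similar([0, 0, 1, 2], past_cases, n_clusters=4, k=2)
for patient_id, score in top:
    print(patient_id, round(score, 3))
```

Each patient is treated like a document and each cluster like a word, so any term-weighting scheme from text retrieval could replace the raw counts used here.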
