Patient grouping optimization using a hybrid self-organizing map and Gaussian mixture model for length of stay-based clustering system

Clustering is a major tool in data analysis: it divides objects into groups through unsupervised training procedures. Clustering algorithms attempt to partition a set of objects into well-defined subgroups based on some similarity between them. The results of the clustering process, however, may not be confirmed by our knowledge of the data. The self-organizing map (SOM) neural network is an effective tool for recognizing clusters of data and relating similar classes to each other in an unsupervised manner. SOM is typically used when the training dataset contains cases with input variables but no associated outputs. SOM can also be used for classification when output classes are available; its advantage in this case is its ability to highlight similarities between classes, and thus to assess different previous classification approaches. This paper uses this ability of SOM to validate length of stay (LOS)-based clustering results obtained using the Gaussian mixture modeling (GMM) approach, by comparing the classification accuracy (percentage of samples correctly classified) of the different results. The idea is the following. In the first step, each GMM approach provides its own scheme for grouping LOS, and different classes are thus recognized and labeled; here we consider GMMs with different LOS intervals. In the second step, SOM first learns to recognize clusters in the data and then compares its cluster map with the labeled clusters previously provided by GMM. A closer similarity between a previous clustering scheme and the SOM cluster map results in better accuracy for clustering LOS data. Ultimately, by comparing different GMM component models, the SOM application leads to an optimal number of patient groups. An application to a surgical dataset showed the effectiveness of this methodology in determining the LOS intervals.
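The second step described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a one-dimensional SOM trained on LOS values alone, synthetic LOS data, and hypothetical cut-points standing in for the LOS intervals that each GMM would supply. The names `train_som`, `som_agreement`, `label_by_cutpoints`, and `schemes` are illustrative only. Each SOM node is labeled by majority vote of the samples it wins, and the accuracy is the fraction of samples whose node label matches their own group label.

```python
import random
from collections import Counter

def train_som(data, n_nodes=6, epochs=30, lr0=0.5, seed=0):
    """Train a 1-D SOM on scalar inputs and return the node weights."""
    rng = random.Random(seed)
    lo, hi = min(data), max(data)
    weights = [rng.uniform(lo, hi) for _ in range(n_nodes)]
    samples = list(data)
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                       # decaying learning rate
        radius = int((n_nodes / 2) * (1.0 - frac))    # shrinking neighbourhood
        rng.shuffle(samples)
        for x in samples:
            # Best-matching unit: the node whose weight is closest to x.
            bmu = min(range(n_nodes), key=lambda j: abs(weights[j] - x))
            # Pull the BMU and its neighbours on the map towards x.
            for j in range(max(0, bmu - radius), min(n_nodes, bmu + radius + 1)):
                weights[j] += lr * (x - weights[j])
    return weights

def som_agreement(data, labels, weights):
    """Label each node by majority vote, then score agreement with labels."""
    bmus = [min(range(len(weights)), key=lambda j: abs(weights[j] - x))
            for x in data]
    votes = {}
    for b, lab in zip(bmus, labels):
        votes.setdefault(b, Counter())[lab] += 1
    node_label = {b: c.most_common(1)[0][0] for b, c in votes.items()}
    correct = sum(node_label[b] == lab for b, lab in zip(bmus, labels))
    return correct / len(data)

def label_by_cutpoints(data, cutpoints):
    """Assign group index = number of cut-points the value exceeds."""
    return [sum(x > c for c in cutpoints) for x in data]

# Synthetic LOS values in days (assumption: three loosely separated groups).
rng = random.Random(1)
los = ([abs(rng.gauss(3, 1)) for _ in range(150)]
       + [abs(rng.gauss(20, 4)) for _ in range(100)]
       + [abs(rng.gauss(80, 10)) for _ in range(50)])

# Hypothetical LOS interval schemes, standing in for different GMM fits.
schemes = {
    "2 groups": [10.0],
    "3 groups": [10.0, 50.0],
    "4 groups": [10.0, 50.0, 80.0],
}

weights = train_som(los)  # the SOM is trained once, without any labels
results = {}
for name, cuts in schemes.items():
    labels = label_by_cutpoints(los, cuts)
    results[name] = som_agreement(los, labels, weights)
    print(f"{name}: agreement with SOM map = {results[name]:.3f}")
```

Because nodes are labeled by majority vote, the score can never fall below the share of the largest group, so it is most informative when the compared schemes have similar numbers of groups.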
