Mining ICDDR, B Hospital Surveillance Data Using Locally Linear Embedding Based SMOTE Algorithm and Multilayer Perceptron

In this paper, the authors use multilayer perceptron (MLP) on hospital surveillance data to categorize admitted patients according to their critical conditions which can be classified as - low, medium and high, to distinguish the criticality. The paper addresses the over-fitting problem in the unbalanced dataset using two distinct approaches since the frequency of instances of the class ‘low’ is significantly higher than other classes. Besides trimming, the unbalanced dataset is balanced by introducing the Synthetic Minority Over-sampling Technique (SMOTE) algorithm coupled with Locally Linear Embedding (LLE). We have constructed three models and applied neural classifications and compared the performances with the decision tree based models that already exist in literature. We show that one of our models outperforms prior models in classification, contingent upon performance time trade-off, giving us an efficient model that handles large scale unbalanced dataset efficiently with standard classification performance. The models developed in this research can become imperative tools to doctors during epidemics.