Mining Personal Health Index from Annual Geriatric Medical Examinations

People take regular medical examinations mostly not to discover diseases but to gain peace of mind about their health status. It is therefore important to give them overall feedback that covers all of their health indicators, ranked against the whole population. In this paper, we propose a framework for mining a Personal Health Index (PHI) from a large and comprehensive geriatric medical examination (GME) dataset. We define PHI as an overall score of personal health status based on the complement probability of health risks. The health risks are calculated using information from the cause of death (COD) dataset that is linked to the GME dataset. In particular, the highest health risk is revealed in the cases of people who had been taking GMEs for some years and then died of medical causes. The proposed framework consists of methods for data pre-processing, feature extraction and selection, and model selection. The effectiveness of the proposed framework is validated by a set of comprehensive experiments based on the records of 102,258 participants. As the first work of its kind, our study provides a baseline for further research.
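
To make the PHI definition concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a single health-risk probability in [0, 1] is already available for each person (for example, from a probabilistic classifier) and that PHI is simply its complement. The function names `personal_health_index` and `percentile_rank`, and the sample values, are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def personal_health_index(risk_probability: float) -> float:
    """Return the complement of a health-risk probability (assumed PHI form)."""
    if not 0.0 <= risk_probability <= 1.0:
        raise ValueError("risk_probability must lie in [0, 1]")
    return 1.0 - risk_probability


def percentile_rank(score: float, population_scores: Sequence[float]) -> float:
    """Return the fraction of the population scoring at or below `score`."""
    if not population_scores:
        raise ValueError("population_scores must not be empty")
    at_or_below = sum(1 for s in population_scores if s <= score)
    return at_or_below / len(population_scores)


# Hypothetical example: one person's risk and a small made-up population.
phi = personal_health_index(0.25)
rank = percentile_rank(phi, [0.5, 0.75, 0.9, 1.0])
```

Ranking the score against the population, as the second function does, mirrors the stated goal of giving feedback relative to everyone examined; how the framework actually aggregates several risks into one score is not specified here.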
