A methodological review of data mining techniques in predictive medicine: An application in hemodynamic prediction for abdominal aortic aneurysm disease

Abstract Modern clinics and hospitals need accurate real-time prediction tools. This paper reviews the importance and current trends of data mining methodologies in predictive medicine, focusing on hemodynamic prediction in abdominal aortic aneurysm (AAA). It also proposes potential data mining frameworks for hemodynamic prediction in AAA. These frameworks take one of three forms: coupling a typical computational modeling simulation with data mining techniques; mining existing real-patient medical datasets directly; or applying visual data mining to already computed hemodynamic features of AAA models. These approaches make it possible to predict statistically the rupture potential of aneurysmal patients and could ideally offer an alternative to tedious and time-consuming computational modeling. Predicting trends in patient-specific aneurysmal conditions by mining large volumes of medical data can also speed up decision making in real-life medicine.
