Logical Analysis of Data (LAD) model for the early diagnosis of acute ischemic stroke

Background: Strokes are a leading cause of morbidity and the first cause of adult disability in the United States. Currently, no biomarkers are being used clinically to diagnose acute ischemic stroke. A diagnostic test using a blood sample from a patient would potentially be beneficial in treating the disease.

Results: A classification approach is described for differentiating between proteomic samples of stroke patients and controls, and a second novel predictive model is developed for predicting the severity of stroke as measured by the National Institutes of Health Stroke Scale (NIHSS). The models were constructed by applying the Logical Analysis of Data (LAD) methodology to the mass peak profiles of 48 stroke patients and 32 controls. The classification model was shown to have an accuracy of 75% when tested on an independent validation set of 35 stroke patients and 25 controls, while the predictive model exhibited superior performance when compared to alternative algorithms. In spite of their high accuracy, both models are extremely simple and were developed using a common set consisting of only 3 peaks.

Conclusion: We have successfully identified 3 biomarkers that can detect ischemic stroke with an accuracy of 75%. The performance of the classification model on the validation set and on cross-validation does not deteriorate significantly when compared to that on the training set, indicating the robustness of the model. As in the case of the LAD classification model, the results of the predictive model validate the function constructed on our support set for approximating the severity scores of stroke patients. The correlation and root mean absolute error of the LAD predictive model are consistently superior to those of the other algorithms used (support vector machines, C4.5 decision trees, logistic regression and multilayer perceptron).
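To make the classification step more concrete, the following is a minimal sketch of how an LAD-style model generally assigns a class: patterns are conjunctions of threshold conditions on a few attributes, and a sample is scored by the fraction of positive patterns covering it minus the fraction of negative patterns covering it. The peak names, cut points, patterns and equal weighting below are hypothetical placeholders chosen for illustration; they are not the 3 peaks or the patterns reported in the paper.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# A condition is (peak name, operator, cut point). All values are hypothetical.
Condition = Tuple[str, str, float]


@dataclass(frozen=True)
class Pattern:
    """A conjunction of threshold conditions on peak intensities."""
    conditions: Tuple[Condition, ...]

    def covers(self, sample: Dict[str, float]) -> bool:
        # A pattern covers a sample only if every condition holds.
        for peak, op, cut in self.conditions:
            value = sample[peak]
            if op == ">":
                if not value > cut:
                    return False
            elif op == "<=":
                if not value <= cut:
                    return False
            else:
                raise ValueError(f"unsupported operator: {op}")
        return True


def discriminant(sample: Dict[str, float],
                 positive: List[Pattern],
                 negative: List[Pattern]) -> float:
    # Equal-weight discriminant: share of positive patterns covering the
    # sample minus share of negative patterns covering it (an assumption;
    # other weightings are possible).
    pos = sum(p.covers(sample) for p in positive) / len(positive)
    neg = sum(p.covers(sample) for p in negative) / len(negative)
    return pos - neg


def classify(sample: Dict[str, float],
             positive: List[Pattern],
             negative: List[Pattern]) -> str:
    score = discriminant(sample, positive, negative)
    if score > 0:
        return "stroke"
    if score < 0:
        return "control"
    return "unclassified"  # no evidence either way


# Hypothetical patterns over three hypothetical peaks.
POSITIVE = [
    Pattern((("peak_a", ">", 0.5),)),
    Pattern((("peak_b", "<=", 0.2), ("peak_c", ">", 1.0))),
]
NEGATIVE = [
    Pattern((("peak_a", "<=", 0.5), ("peak_b", ">", 0.2))),
]

if __name__ == "__main__":
    sample = {"peak_a": 0.9, "peak_b": 0.1, "peak_c": 1.5}
    print(classify(sample, POSITIVE, NEGATIVE))
```

A sample covered by neither kind of pattern scores zero and is left unclassified here, which is one common way to handle that case; a real model would need a rule for it chosen on the training data.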
