Predicting phenotypes of asthma and eczema with machine learning

Background: There is increasing recognition that asthma and eczema are heterogeneous diseases. We investigated the ability of a spectrum of machine learning methods to distinguish clinical sub-groups of asthma, wheeze and eczema, using a large heterogeneous set of attributes in an unselected population. The aim was to identify to what extent such heterogeneous information can be combined to reveal specific clinical manifestations.

Methods: The study population comprised a cross-sectional sample of adults, and included representatives of the general population enriched by subjects with asthma. Linear and non-linear machine learning methods, from logistic regression to random forests, were fit on a large attribute set including demographic, clinical and laboratory features, genetic profiles and environmental exposures. The outcomes of interest were asthma, wheeze and eczema, encoded by different operational definitions. Model validation was performed via bootstrapping.

Results: The study population included 554 adults, 42% male, 38% previous or current smokers. The proportions of asthma, wheeze, and eczema diagnoses were 16.7%, 12.3%, and 21.7%, respectively. Models were fit on 223 non-genetic variables plus 215 single nucleotide polymorphisms. In general, non-linear models achieved higher sensitivity and specificity than other methods, especially for asthma and wheeze and less so for eczema, with areas under the receiver operating characteristic curve of 84%, 76% and 64%, respectively. Our findings confirm that allergen sensitisation and lung function characterise asthma better in combination than separately. The predictive ability of genetic markers alone is limited. For eczema, new predictors such as bio-impedance were discovered.

Conclusions: More usefully-complex modelling is the key to a better understanding of disease mechanisms and personalised healthcare: further advances are likely with the incorporation of more factors/attributes and longitudinal measures.
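As an illustration of the kind of bootstrap validation mentioned in the Methods, the following is a minimal sketch and not the authors' pipeline. It assumes synthetic data with two made-up features, a plain logistic regression standing in for the full range of models (no random forest is fitted here), and an out-of-bag variant of the bootstrap, which is one common choice; the abstract does not say which variant was used. All function names and parameter values are hypothetical.

```python
import math
import random


def predict(w, x):
    """Predicted probability from a logistic model; the last weight is the intercept."""
    z = w[-1] + sum(wj * v for wj, v in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=200):
    """Fit logistic regression by batch gradient descent (no regularisation)."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict(w, xi) - yi
            for j, v in enumerate(xi):
                grad[j] += err * v
            grad[-1] += err
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


def auc(scores, labels):
    """Area under the ROC curve as the Mann-Whitney statistic (ties count 0.5)."""
    pos = [s for s, t in zip(scores, labels) if t == 1]
    neg = [s for s, t in zip(scores, labels) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc(X, y, n_boot=20, seed=0):
    """Mean out-of-bag AUC: train on each resample, score the subjects left out."""
    rng = random.Random(seed)
    n = len(X)
    aucs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        in_bag = set(idx)
        oob = [i for i in range(n) if i not in in_bag]
        y_oob = [y[i] for i in oob]
        if len(set(y_oob)) < 2:
            continue  # AUC is undefined without both classes
        w = fit_logistic([X[i] for i in idx], [y[i] for i in idx])
        aucs.append(auc([predict(w, X[i]) for i in oob], y_oob))
    return sum(aucs) / len(aucs)


def make_data(n=200, seed=1):
    """Synthetic cohort (assumption): only the first feature carries signal."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        x = [rng.gauss(0, 1), rng.gauss(0, 1)]
        p = 1.0 / (1.0 + math.exp(-(2.0 * x[0] - 1.5)))
        X.append(x)
        y.append(1 if rng.random() < p else 0)
    return X, y


X, y = make_data()
print("mean out-of-bag AUC:", round(bootstrap_auc(X, y), 3))
```

Scoring only the out-of-bag subjects keeps each performance estimate separate from the data used to fit that resample's model, which is the reason for validating by resampling rather than reporting the apparent (training-set) AUC.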
