Predicting lymphoma outcomes and risk factors in patients with primary Sjögren’s Syndrome using gradient boosting tree ensembles

Primary Sjögren’s Syndrome (pSS) is a chronic autoimmune disease accompanied by exocrine gland dysfunction, and it has long been reported that 5% of pSS patients are prone to developing lymphoma. In this work, we use clinical data from 449 pSS patients to develop a first rule-based, supervised learning model that can predict lymphoma outcomes and identify prominent features for lymphoma prediction in pSS patients. To this end, gradient boosting combined with regression tree ensembles is used to derive a rule-based decision model for lymphoma prediction. Our results show an average accuracy of 87.1% and an area under the curve (AUC) of 88%, and highlight the C4 value, the rheumatoid factor, and lymphadenopathy as prominent lymphoma predictors, among others.
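
As an illustration of the general technique named above, the following is a minimal sketch of gradient boosting with one-split regression trees (stumps) for a binary outcome, written in plain Python. It is not the model developed in this work: the data are synthetic, the three features are anonymous, and the names `fit_stump`, `fit_gbm`, and `predict_proba`, as well as the learning rate and number of rounds, are assumptions chosen for the example. Each round fits a stump to the residuals of the logistic loss and accumulates the squared-error reduction per feature as a simple importance score.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(X, r):
    """Least-squares fit of a one-split regression tree to residuals r.

    Returns (gain, feature, threshold, left_value, right_value), where gain
    is the reduction in squared error achieved by the split.
    """
    mean_all = sum(r) / len(r)
    sse_root = sum((v - mean_all) ** 2 for v in r)
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0  # candidate split between adjacent values
            left = [v for row, v in zip(X, r) if row[j] <= thr]
            right = [v for row, v in zip(X, r) if row[j] > thr]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((v - lm) ** 2 for v in left)
                   + sum((v - rm) ** 2 for v in right))
            if best is None or sse_root - sse > best[0]:
                best = (sse_root - sse, j, thr, lm, rm)
    return best


def fit_gbm(X, y, n_rounds=25, lr=0.5):
    """Gradient boosting on the logistic loss with stumps as base learners."""
    p = sum(y) / len(y)
    f0 = math.log(p / (1.0 - p))      # initial score: log-odds of the base rate
    F = [f0] * len(y)
    stumps = []
    importance = [0.0] * len(X[0])    # accumulated split gain per feature
    for _ in range(n_rounds):
        # Negative gradient of the logistic loss with respect to the score F
        residuals = [yi - sigmoid(fi) for yi, fi in zip(y, F)]
        gain, j, thr, lv, rv = fit_stump(X, residuals)
        stumps.append((j, thr, lv, rv))
        importance[j] += gain
        F = [fi + lr * (lv if row[j] <= thr else rv) for fi, row in zip(F, X)]
    return {"f0": f0, "lr": lr, "stumps": stumps}, importance


def predict_proba(model, row):
    score = model["f0"]
    for j, thr, lv, rv in model["stumps"]:
        score += model["lr"] * (lv if row[j] <= thr else rv)
    return sigmoid(score)


# Synthetic data (assumption): the outcome depends strongly on feature 0,
# weakly on feature 1, and not at all on feature 2.
random.seed(0)
X = [[random.random() for _ in range(3)] for _ in range(150)]
y = [1 if row[0] + 0.3 * row[1] < 0.5 else 0 for row in X]

model, importance = fit_gbm(X, y)
predictions = [1 if predict_proba(model, row) >= 0.5 else 0 for row in X]
accuracy = sum(int(p == t) for p, t in zip(predictions, y)) / len(y)
top_feature = max(range(len(importance)), key=lambda j: importance[j])

print("training accuracy:", accuracy)
print("feature importance (total split gain):", importance)
```

Because each stump is a single threshold rule on one feature, the fitted ensemble can be read as a weighted list of rules, and the per-feature gain gives a rough ranking of predictors. In practice, accuracy and AUC would be estimated on held-out data (for example by cross-validation) rather than on the training set as in this sketch.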
