Machine-learning based knowledge discovery in rheumatoid arthritis related registry data to identify predictors of persistent pain.

Early detection of patients with chronic diseases who are at risk of developing persistent pain is clinically desirable for the timely initiation of multimodal therapies. Quality follow-up registries may provide the necessary clinical data; however, their design is not focused on a specific research aim, which poses challenges for the data-analysis strategy. Here, machine learning was used to identify early parameters that provide information about the future development of persistent pain in rheumatoid arthritis (RA). Data of 288 patients were queried from a registry based on the Swedish Epidemiological Investigation of RA (EIRA). Unsupervised machine learning identified three distinct patient subgroups with low, median, and high persistent pain intensities. Next, supervised machine learning, implemented as random forests followed by computed ABC analysis-based item categorization, was used to select predictive parameters among 21 different demographic, patient-rated, and objective clinical factors. The selected parameters were used to train machine-learned algorithms to assign patients to pain-related subgroups (1,000 random resamplings, 2/3 training, 1/3 test data). Algorithms trained with three-month data of patient global assessment and health assessment questionnaire provided pain group assignment at a balanced accuracy of 70 %. When the predictors were restricted to objective clinical parameters of disease severity, swollen joint count and tender joint count acquired at three months provided a balanced accuracy of 59 %. The results indicate that machine learning is suited to extracting knowledge from data queried from pain- and disease-related registries. Early functional parameters of RA are informative for the development and degree of persistent pain.
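The evaluation scheme named above (repeated random 2/3 training and 1/3 test splits scored by balanced accuracy) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the nearest-centroid classifier is only a stand-in for the trained algorithms, and the function names are hypothetical. It does not reproduce the registry analysis or its results.

```python
import random
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls, so every class weighs the same."""
    hits, totals = defaultdict(int), defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        if truth == pred:
            hits[truth] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def fit_centroids(X, y):
    """Stand-in classifier (assumption): one mean vector per class."""
    sums, counts = {}, defaultdict(int)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] += 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def predict_centroids(centroids, X):
    """Assign each row to the class with the closest centroid."""
    def dist2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return [min(centroids, key=lambda c: dist2(row, centroids[c])) for row in X]


def resampled_balanced_accuracy(X, y, n_resamplings=1000, train_fraction=2 / 3, seed=0):
    """Median balanced accuracy over repeated random train/test splits."""
    rng = random.Random(seed)
    indices = list(range(len(y)))
    cut = int(round(train_fraction * len(indices)))
    scores = []
    for _ in range(n_resamplings):
        rng.shuffle(indices)
        train, test = indices[:cut], indices[cut:]
        centroids = fit_centroids([X[i] for i in train], [y[i] for i in train])
        preds = predict_centroids(centroids, [X[i] for i in test])
        scores.append(balanced_accuracy([y[i] for i in test], preds))
    scores.sort()
    return scores[len(scores) // 2]


def make_synthetic(n_per_group=96, seed=1):
    """Invented two-feature data for three groups; not registry data."""
    rng = random.Random(seed)
    X, y = [], []
    for label, center in (("low", 0.0), ("median", 2.0), ("high", 4.0)):
        for _ in range(n_per_group):
            X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic()
    print(resampled_balanced_accuracy(X, y, n_resamplings=1000))
```

Balanced accuracy averages the recall of each class, which keeps a large subgroup from dominating the score when group sizes differ. Reporting the median over the resamplings is one possible summary and is likewise an assumption of this sketch.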
