Enabling Personalized Decision Support with Patient-Generated Data and Attributable Components

Health-related decision-making is complex. Machine learning (ML) applied to patient-generated data can identify patterns and insights at the individual level, where human cognition falls short, but not all ML-generated information is equally useful for making health-related decisions. We develop attributable components analysis (ACA), a method inspired by optimal transport theory, and apply it to type 2 diabetes self-monitoring data to identify patterns of association between nutrition and blood glucose control. In comparison with linear regression, we found that ACA has several characteristics that make it promising for use in decision support applications: it identified non-linear relationships, was more robust to outliers, and offered broader and more expressive uncertainty estimates. Our results also highlight a tradeoff between model accuracy and interpretability, and we discuss the implications for ML-driven decision support systems.
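The outlier sensitivity of linear regression mentioned above can be illustrated with a minimal sketch. It does not implement ACA: it uses the Theil-Sen median-of-slopes estimator as a stand-in robust method, and the data are synthetic, not the study's self-monitoring data. The function names `ols_slope` and `theil_sen_slope` and all values are assumptions made for illustration only.

```python
from itertools import combinations
from statistics import mean, median


def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


def theil_sen_slope(xs, ys):
    """Median of all pairwise slopes (a robust stand-in, NOT ACA)."""
    points = list(zip(xs, ys))
    return median(
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in combinations(points, 2)
        if x2 != x1
    )


# Synthetic data (assumption): an exact linear trend y = 2x + 1.
xs = list(range(10))
ys_clean = [2 * x + 1 for x in xs]

# Same data with the last observation replaced by a gross outlier.
ys_outlier = ys_clean[:-1] + [100]

ols_clean = ols_slope(xs, ys_clean)
ols_with_outlier = ols_slope(xs, ys_outlier)
robust_with_outlier = theil_sen_slope(xs, ys_outlier)

print("OLS slope, clean data:        ", ols_clean)
print("OLS slope, one outlier:       ", ols_with_outlier)
print("Theil-Sen slope, one outlier: ", robust_with_outlier)
```

In this toy setting a single corrupted point pulls the least-squares slope far from the underlying trend, while the median-based estimate is unaffected. Self-reported data often contain such entries, which is one reason robustness matters for decision support.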
