A probabilistic semantic analysis of eHealth scientific literature

Introduction: eHealth emerged as an interdisciplinary research area about 70 years ago. This study applies probabilistic techniques to semantically analyse the scientific literature on eHealth, in order to identify topics and trends and to discuss their comparative evolution.

Methods: The authors collected titles and abstracts of published literature on eHealth as indexed in PubMed. Basic statistical and bibliometric techniques were applied to describe the collected corpus as a whole; Latent Dirichlet Allocation was employed for unsupervised topic identification; topic trend analysis was performed; and correlation graphs were plotted where relevant.

Results: A total of 30,425 records on eHealth were retrieved from PubMed (all records until 31 December 2017; search performed on 8 May 2018), and 23,988 of these were included in the study corpus. The eHealth domain is growing faster than the PubMed corpus as a whole: the proportion of the eHealth corpus has increased by about 7% per year on average over the last 20 years. Probabilistic topic modelling identified 100 meaningful topics, which the authors organised into nine categories: general; service model; disease; medical specialty; behaviour and lifestyle; education; technology; evaluation; and regulatory issues.

Discussion: Trend analysis shows a continuous shift in focus. The early emphasis on medical image transmission and system integration was replaced by an increased focus on standards, wearables and sensor devices, which is now giving way to mobile applications, social media and data analytics. Attention to disease is also shifting, from the initial popularity of surgery, trauma and acute heart disease, to the emergence of chronic disease support, and to the recent attention to cancer, infectious disease, mental disorders, paediatrics and perinatal care. Most interestingly, the current swift increase is in research related to lifestyle and behaviour change. The steady growth of all topics related to assessment and various systematic evaluation techniques indicates a maturing research field that is moving towards real-world application.
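
The reported growth of the eHealth share of PubMed can be illustrated with a small calculation. The Python sketch below is only one plausible reading of "mean increase of corpus proportion per year": it computes the yearly share of a domain corpus within a larger index and averages the year-on-year relative change. The function names and the counts are hypothetical and purely illustrative; they are not the study's data, and the abstract does not specify the authors' exact procedure.

```python
from statistics import mean


def yearly_proportions(domain_counts, total_counts):
    """Return {year: domain share of the whole index} for years with a nonzero total."""
    return {
        year: domain_counts[year] / total_counts[year]
        for year in sorted(domain_counts)
        if total_counts.get(year)
    }


def mean_relative_increase(proportions):
    """Average year-on-year relative change of the yearly proportion."""
    years = sorted(proportions)
    changes = [
        (proportions[later] - proportions[earlier]) / proportions[earlier]
        for earlier, later in zip(years, years[1:])
        if proportions[earlier] > 0
    ]
    return mean(changes) if changes else 0.0


# Hypothetical counts, NOT taken from the study.
domain = {2014: 100, 2015: 121, 2016: 144}
total = {2014: 10000, 2015: 11000, 2016: 12000}

props = yearly_proportions(domain, total)
growth = mean_relative_increase(props)
print(f"mean yearly increase of corpus proportion: {growth:.1%}")
```

Normalising by the size of the whole index in each year separates genuine growth of the domain from the general growth of the indexed literature, which is why the sketch tracks proportions rather than raw record counts.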
