Deep Neural Architectures for Discourse Segmentation in E-Mail Based Behavioral Interventions.

Communication science approaches to developing effective behavior interventions, such as motivational interviewing (MI), are limited by traditional qualitative coding of communication exchanges, a resource-intensive and time-consuming process. This study focuses on the analysis of e-Coaching sessions, which are behavior interventions delivered via email and grounded in the principles of MI. A critical step towards automated qualitative coding of e-Coaching sessions is the segmentation of emails into fragments that correspond to MI behaviors. This study frames the email segmentation task as a classification problem and uses word and punctuation mark embeddings in conjunction with part-of-speech features to address it. We evaluated the performance of conditional random fields (CRF), a multi-layer perceptron (MLP), a bi-directional recurrent neural network (BRNN), and a convolutional recurrent neural network (CRNN) on the task of email segmentation. Our results indicate that CRNN outperforms CRF, MLP, and BRNN, achieving a weighted macro-averaged F1-measure of 0.989 and an F1-measure of 0.825 for new segment detection.
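One way to read the classification framing is as a per-token labeling task: each word or punctuation mark is labeled according to whether it starts a new segment, and new segment detection is scored with the F1-measure of the "starts a segment" class. The sketch below illustrates that reading only. The binary label scheme, the helper names `segments_to_labels` and `f1_for_class`, and the toy email are assumptions made for illustration and are not taken from the study.

```python
from typing import List, Tuple


def segments_to_labels(segments: List[List[str]]) -> Tuple[List[str], List[int]]:
    """Flatten segments into tokens; label 1 marks a token that starts a segment.

    Assumption: a binary scheme (1 = new segment, 0 = continuation).
    """
    tokens: List[str] = []
    labels: List[int] = []
    for segment in segments:
        for position, token in enumerate(segment):
            tokens.append(token)
            labels.append(1 if position == 0 else 0)
    return tokens, labels


def f1_for_class(gold: List[int], pred: List[int], cls: int = 1) -> float:
    """F1-measure for a single class, computed from token-level labels."""
    tp = sum(1 for g, p in zip(gold, pred) if g == cls and p == cls)
    fp = sum(1 for g, p in zip(gold, pred) if g != cls and p == cls)
    fn = sum(1 for g, p in zip(gold, pred) if g == cls and p != cls)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy email with two hand-made segments (hypothetical data).
segments = [
    ["Thanks", "for", "writing", "."],
    ["How", "did", "the", "week", "go", "?"],
]
tokens, gold = segments_to_labels(segments)

# Hypothetical classifier output with one spurious boundary at "week".
pred = [1, 0, 0, 0, 1, 0, 0, 1, 0, 0]

new_segment_f1 = f1_for_class(gold, pred, cls=1)
print(list(zip(tokens, gold)))
print(round(new_segment_f1, 3))  # 0.8
```

Because boundary tokens are far rarer than continuation tokens, a score averaged over both classes can be high even when boundary detection is noticeably weaker, which matches the gap between the two F1 figures reported above.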
