Discriminative Joint Modeling of Lexical Variation and Acoustic Confusion for Automated Narrative Retelling Assessment

Automatically assessing the fidelity of a retelling to the original narrative, a task of growing clinical importance, is challenging because speakers paraphrase extensively during retelling and because automatic speech recognition (ASR) errors cascade into downstream analysis. We present a word tagging approach using conditional random fields (CRFs) that allows a diversity of features to be considered during inference, including features that capture the acoustic confusions encoded in word confusion networks. We evaluate the approach under several scenarios, including both supervised and unsupervised training, the latter achieved by training on the output of a baseline automatic word-alignment model. We also adapt the ASR models to the domain and evaluate the impact of error rate on performance. We find strong robustness to ASR errors, even when using just the 1-best system output. A hybrid approach that combines automatic alignment with CRF-trained tagging models achieves the best performance, yielding strong improvements over using either approach alone.
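To make the idea of tagging features drawn from word confusion networks more concrete, the following is a minimal sketch and not the feature set used in the paper. It assumes a confusion network is represented as a list of slots, each holding competing (word, posterior) hypotheses, and that the vocabulary of the source narrative is available as a set. The names `Slot`, `best_word`, `slot_features`, and `network_features`, along with every individual feature, are hypothetical choices made for illustration.

```python
from typing import Dict, List, Set, Tuple

# Assumed representation: one slot per word position, each slot a list of
# competing (word, posterior) hypotheses from the recognizer.
Slot = List[Tuple[str, float]]


def best_word(slot: Slot) -> str:
    """Return the highest-posterior word in a slot (the 1-best hypothesis)."""
    return max(slot, key=lambda wp: wp[1])[0]


def slot_features(network: List[Slot], i: int, source_words: Set[str]) -> Dict[str, object]:
    """Build an illustrative feature dictionary for position i of the network."""
    slot = network[i]
    top = best_word(slot)
    return {
        "bias": 1.0,
        "word": top,
        "top_posterior": max(p for _, p in slot),
        # Lexical match between the 1-best word and the source narrative.
        "top_in_source": top in source_words,
        # Acoustic-confusion features: a source word may be hiding among
        # the lower-ranked alternatives of this slot.
        "any_alt_in_source": any(w in source_words for w, _ in slot),
        "source_posterior_mass": sum(p for w, p in slot if w in source_words),
        # Neighboring 1-best words, with boundary markers at the edges.
        "prev_word": best_word(network[i - 1]) if i > 0 else "<s>",
        "next_word": best_word(network[i + 1]) if i < len(network) - 1 else "</s>",
    }


def network_features(network: List[Slot], source_words: Set[str]) -> List[Dict[str, object]]:
    """Produce one feature dictionary per slot, in order."""
    return [slot_features(network, i, source_words) for i in range(len(network))]


# Made-up example: the recognizer prefers "dock", but "dog" is an alternative.
example_source = {"dog", "chased", "ball"}
example_network = [
    [("the", 0.9), ("a", 0.1)],
    [("dock", 0.6), ("dog", 0.4)],
    [("chased", 1.0)],
]
example_features = network_features(example_network, example_source)
```

In this made-up example the second slot's 1-best word does not match the source narrative, yet the alternative-based features still signal a likely match. That is the kind of evidence a sequence tagger could exploit when the 1-best output contains recognition errors. A list of per-token dictionaries like this is a common input format for CRF toolkits, though the choice of toolkit is left open here.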
