Towards end-2-end learning for predicting behavior codes from spoken utterances in psychotherapy conversations
Shrikanth S. Narayanan | David C. Atkins | Karan Singla | Zhuohao Chen
[1] Hagen Soltau,et al. Neural Speech Recognizer: Acoustic-to-Word LSTM Model for Large Vocabulary Speech Recognition, 2016, INTERSPEECH.
[2] Andreas Stolcke,et al. Recurrent neural network and LSTM models for lexical utterance classification , 2015, INTERSPEECH.
[3] Shrikanth Narayanan,et al. Improving the Prediction of Therapist Behaviors in Addiction Counseling by Exploiting Class Confusions , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[4] Verónica Pérez-Rosas,et al. Predicting Counselor Behaviors in Motivational Interviewing Encounters , 2017, EACL.
[5] Vivek Srikumar,et al. Observing Dialogue in Therapy: Categorizing and Forecasting Behavioral Codes , 2019, ACL.
[6] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[7] Yongqiang Wang,et al. Towards End-to-end Spoken Language Understanding , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[8] Prateek Verma,et al. Audio-linguistic Embeddings for Spoken Sentences , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[9] Keikichi Hirose,et al. Prosodic word boundary detection using statistical modeling of moraic fundamental frequency contours and its use for continuous speech recognition , 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258).
[10] James R. Glass,et al. Speech2Vec: A Sequence-to-Sequence Framework for Learning Word Embeddings from Speech , 2018, INTERSPEECH.
[11] Christopher Joseph Pal,et al. Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning , 2018, ICLR.
[12] Jean-Claude Junqua,et al. A robust algorithm for word boundary detection in the presence of noise, 1994, IEEE Trans. Speech Audio Process.
[13] Jeffrey Dean,et al. Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.
[14] Chong Wang,et al. Deep Speech 2: End-to-End Speech Recognition in English and Mandarin, 2015, ICML.
[15] Shrikanth S. Narayanan,et al. Using Prosodic and Lexical Information for Learning Utterance-level Behaviors in Psychotherapy , 2018, INTERSPEECH.
[16] Heiga Zen,et al. WaveNet: A Generative Model for Raw Audio , 2016, SSW.
[17] Panayiotis G. Georgiou,et al. Robust word boundary detection in spontaneous speech using acoustic and lexical cues , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.
[18] Ruhi Sarikaya,et al. Contextual domain classification in spoken language understanding systems using recurrent neural network , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[19] Sanjeev Khudanpur,et al. Librispeech: An ASR corpus based on public domain audio books , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[20] David C. Atkins,et al. A Comparison of Natural Language Processing Methods for Automated Coding of Motivational Interviewing, 2016, Journal of Substance Abuse Treatment.
[21] Diyi Yang,et al. Hierarchical Attention Networks for Document Classification , 2016, NAACL.
[22] Quoc V. Le,et al. Listen, attend and spell: A neural network for large vocabulary conversational speech recognition , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[23] Panayiotis G. Georgiou,et al. Behavioral Coding of Therapist Language in Addiction Counseling Using Recurrent Neural Networks , 2016, INTERSPEECH.
[24] Gokhan Tur,et al. Spoken Language Understanding: Systems for Extracting Semantic Information from Speech, 2011.
[25] Yoshua Bengio,et al. Speech Model Pre-training for End-to-End Spoken Language Understanding , 2019, INTERSPEECH.
[26] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[27] David Suendermann-Oeft,et al. Exploring ASR-free end-to-end modeling to improve spoken language understanding in a cloud-based dialog system , 2017, 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[28] Panayiotis G. Georgiou,et al. Multi-Label Multi-Task Deep Learning for Behavioral Coding , 2018, IEEE Transactions on Affective Computing.
[29] Geoffrey Zweig,et al. Recurrent neural networks for language understanding , 2013, INTERSPEECH.