Objective Prediction of Social Skills Level for Automated Social Skills Training Using Audio and Text Information

Although Social Skills Training is a well-established and effective method for acquiring appropriate social skills for daily communication, access to such training is limited by a shortage of therapists. Automatic training systems are therefore needed to ameliorate this situation, and evaluating social skills fairly in such systems requires an objective evaluation method. In this paper, we used the second edition of the Social Responsiveness Scale (SRS-2) as an objective evaluation metric and developed an automatic evaluation system based on linear regression with multi-modal features. The newly adopted features include 28 audio features and a BERT-based sequential similarity (seq-similarity), which indicates how consistent the meaning of a user's utterances remains. We achieved a Pearson correlation coefficient of 0.35 for predicting the SRS-2 overall score and 0.60 for predicting the social communication score, one of the SRS-2 treatment sub-scales. These results suggest that our system can objectively predict levels of social skills. Note that we evaluated the system only on healthy subjects, since this study is still at the feasibility phase; further evaluation with real patients is left for future work.
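The evaluation pipeline summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that seq-similarity is the mean cosine similarity between embeddings of consecutive utterances, that predictions are obtained with leave-one-out cross-validation, and that features and scores are synthetic stand-ins. The function names `seq_similarity`, `loo_predict`, and `pearson` are hypothetical, and the BERT embeddings are replaced by plain arrays.

```python
import numpy as np


def seq_similarity(embeddings: np.ndarray) -> float:
    """Mean cosine similarity between consecutive utterance embeddings.

    Assumption: one embedding row per utterance (e.g. from a BERT encoder).
    """
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    return float(np.mean(np.sum(unit[:-1] * unit[1:], axis=1)))


def pearson(x, y) -> float:
    """Pearson correlation coefficient between two 1-D sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    return float(xc @ yc / np.sqrt((xc @ xc) * (yc @ yc)))


def loo_predict(X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Leave-one-out predictions from ordinary least-squares regression."""
    n = len(y)
    Xb = np.hstack([X, np.ones((n, 1))])  # append an intercept column
    preds = np.empty(n)
    for i in range(n):
        train = np.arange(n) != i  # hold out subject i
        coef, *_ = np.linalg.lstsq(Xb[train], y[train], rcond=None)
        preds[i] = Xb[i] @ coef
    return preds


# Synthetic stand-in data (NOT from the paper): 30 subjects, 3 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))
y = X @ np.array([1.0, -0.5, 0.2]) + rng.normal(scale=0.5, size=30)

preds = loo_predict(X, y)
r = pearson(y, preds)
print(f"Pearson r between true and predicted scores: {r:.2f}")
```

In a real setting, the columns of `X` would hold the audio and text features (including the seq-similarity value per subject) and `y` the SRS-2 scores; the correlation between held-out predictions and true scores is the kind of figure the abstract reports.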
