Context and Culture Affect the Psychometrics of Questionnaires Evaluating Speech-Based Assistants

Intelligent Personal Assistants (IPAs) have matured into technologically sophisticated systems. However, the instruments used to evaluate the usability and user experience of IPAs were developed two decades ago, which creates the risk that research and development apply inadequate measurements to a novel technology. In this study, more recent scales from human-robot interaction were used to evaluate speech-based assistants in vehicles with a Chinese sample. It cannot be assumed, however, that adapting a questionnaire from another context and culture yields objective, reliable, and valid measurements. The data were therefore examined with respect to internal consistency and factor structure. Cronbach's alpha was high, but a factor analysis did not support the assumed four-factor structure and instead suggested a two-factor solution. The findings indicate that adapting a questionnaire from a different context and culture can affect its psychometric properties; consequently, the postulated underlying constructs should be treated with caution.
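To make the internal-consistency check concrete, the sketch below shows how Cronbach's alpha is commonly computed from an item-response matrix, following alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name and the randomly generated ratings are illustrative assumptions only, not the study's questionnaire data or analysis code.

import numpy as np

def cronbach_alpha(items):
    # items: (n_respondents, n_items) matrix of questionnaire responses
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)          # variance of each item
    total_variance = items.sum(axis=1).var(ddof=1)       # variance of summed scores
    return (k / (k - 1)) * (1.0 - item_variances.sum() / total_variance)

# Illustrative call with random 5-point ratings (30 respondents, 4 items);
# placeholder data, not the study's sample.
rng = np.random.default_rng(0)
responses = rng.integers(1, 6, size=(30, 4)).astype(float)
print(round(cronbach_alpha(responses), 2))

A subsequent exploratory factor analysis on the same response matrix would then indicate how many latent factors the items actually load on, which is the step that yielded the two-factor solution reported above.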
