Evaluation criteria for human-automation performance metrics

Previous research has identified broad metric classes for human-automation performance to facilitate metric selection, as well as the understanding and comparison of research results. However, an objective method for selecting the most efficient set of metrics is still lacking. This research identifies and presents a list of evaluation criteria that can help determine the quality of a metric in terms of experimental constraints, comprehensive understanding, construct validity, statistical efficiency, and measurement technique efficiency. Future research will build on these evaluation criteria and existing generic metric classes to develop a cost-benefit analysis approach that can be used for metric selection.
