Calibrating the experimental measurement of psychological attributes.

Behavioural researchers often seek to experimentally manipulate, measure and analyse latent psychological attributes, such as memory, confidence or attention. The best measurement strategy is often difficult to intuit. Classical psychometric theory, mostly focused on individual differences in stable attributes, offers little guidance. Hence, measurement methods in experimental research are often based on tradition and differ between communities. Here we propose a criterion, which we term 'retrodictive validity', that provides a relative numerical estimate of the accuracy of any given measurement approach. It is determined by performing calibration experiments in which a latent attribute is deliberately manipulated, and then assessing the correlation between intended and measured attribute values. Our approach facilitates optimising measurement strategies and quantifying measurement uncertainty. This, in turn, enables power analyses that define minimum required sample sizes. Taken together, our approach provides a metrological perspective on measurement practice in experimental research that complements classical psychometrics.
