Rating Scales in Accounting Research: The Impact of Scale Points and Labels

ABSTRACT: Rating scales are among the most widely used tools in behavioral research, and decisions regarding scale design can have a profound effect on research findings. Despite this importance, an analysis of extant literature in top accounting journals reveals a wide variety of rating scale compositions. This paper experimentally investigates the impact of scale characteristics on participants' responses. Two experiments manipulate the number of scale points and the corresponding labels to study their influence on the statistical properties of the resultant data. Results suggest that scale design affects the statistical characteristics of response data and emphasize the importance of labeling all scale points. A scale with all points labeled effectively minimizes response bias, maximizes variance, maximizes power, and minimizes error. This analysis also suggests variance may be maximized when the scale length is set at 7 points. Although researchers commo...
