On the application of measurement theory in software engineering

Elements of measurement theory have recently been introduced into the software engineering discipline, and it has been suggested that these elements should serve as the basis for developing, reasoning about, and applying measures. For example, it has been suggested that software complexity measures should be additive; that measures fall into a number of distinct types (i.e., levels of measurement: nominal, ordinal, interval, and ratio); that certain statistical techniques are not appropriate for certain types of measures (e.g., parametric statistics for less-than-interval measures); and that certain transformations are not permissible for certain types of measures (e.g., non-linear transformations for interval measures). In this paper we argue that, in spite of the importance of measurement theory, many of these prescriptions and proscriptions are, in the context of software engineering, either premature or, if strictly applied, a substantial hindrance to the progress of empirical research. This argument is based partially on studies conducted by behavioral scientists and statisticians over the last five decades. We also present a pragmatic approach to the application of measurement theory in software engineering. While following our approach may lead to violations of the strict prescriptions and proscriptions of measurement theory, we demonstrate that in practical terms these violations would have diminished consequences, especially when compared to the advantages afforded to the practicing researcher.
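To make the proscription on parametric statistics for less-than-interval measures concrete, the following minimal sketch shows the kind of constructed case usually offered in its support: an order-preserving but non-linear rescaling of ordinal scores can reverse a comparison of means while leaving a comparison of medians unchanged. The data, the square-root rescaling, and the names `group_a`, `group_b`, `rescale`, and `compare` are hypothetical and chosen only for illustration; they are not taken from the paper, and the example says nothing about how often such reversals occur in practice, which is the question the paper addresses.

```python
import math
from statistics import mean, median

# Hypothetical ordinal scores (e.g., subjective complexity ratings) for two groups.
group_a = [1, 1, 10]
group_b = [3, 4, 4]


def rescale(scores):
    """Apply a strictly increasing, non-linear transformation (square root).

    On an ordinal scale any strictly increasing relabeling is admissible,
    because only the order of the values is assumed to be meaningful.
    """
    return [math.sqrt(x) for x in scores]


def compare(stat, a, b):
    """Return '>', '<', or '=' for stat(a) relative to stat(b)."""
    left, right = stat(a), stat(b)
    if left > right:
        return ">"
    if left < right:
        return "<"
    return "="


# Comparisons on the original scores.
mean_before = compare(mean, group_a, group_b)
median_before = compare(median, group_a, group_b)

# Comparisons after the order-preserving rescaling.
mean_after = compare(mean, rescale(group_a), rescale(group_b))
median_after = compare(median, rescale(group_a), rescale(group_b))

print("mean:   A", mean_before, "B before; A", mean_after, "B after")
print("median: A", median_before, "B before; A", median_after, "B after")
```

In this constructed case the direction of the mean comparison depends on the arbitrary choice of scale, whereas the direction of the median comparison does not. The example was built to exhibit a reversal (group A contains one extreme score), so it should be read as an illustration of the strict argument rather than as evidence about typical data.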
