Two N-of-1 self-trials on readability differences between anonymous inner classes (AICs) and lambda expressions (LEs) on Java code snippets

In Java, lambda expressions (LEs) were introduced at a time when the similar language construct, the anonymous inner class (AIC), had already existed for years. But while LEs have become quite popular in mainstream programming languages in general, their usability is hardly studied. From the Java perspective, the need to study the relationship between LEs and AICs was and is quite obvious, because both language constructs co-exist. However, it is quite common for new language constructs to be introduced even though they have hardly, if at all, been studied using scientific methods – and an often-heard argument from programming language designers is that the effort or cost of applying the scientific method to language constructs is too high. The present paper contributes in two ways. First, with respect to LEs in comparison to AICs, it presents two N-of-1 studies (i.e. randomized controlled trials executed on a single subject) in which LEs and AICs are used as listeners in Java code. Both experiments had two similar and rather simple tasks ("count the number of parameters" and "count the number of used parameters", respectively), with reaction time as the dependent variable. The first experiment used the number of parameters, the second the number of used parameters, as the controlled independent variable (in addition to the technique, LE or AIC). Other variables (LOC, etc.) were randomly generated within given boundaries. The main result of both experiments is that LEs without type annotations require less reading time (p ≤ .2, reduction of reaction time of at most 35%). The results are based on 9,600 observations (one N-of-1 trial with eight replications). This gives evidence that LEs without type annotations improve the readability of code. However, the effect seems so small that we do not expect it to have a larger impact on daily programming. Second, we see a contribution of this paper in the application of N-of-1 trials.
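The experimental code snippets themselves are not reproduced here, but the contrast between the two constructs can be sketched as follows. This is a minimal illustration using a hypothetical `ChangeListener` interface, not the material used in the trials:

```java
// Illustrative sketch (not the experimental material): the same listener-style
// callback written as an anonymous inner class (AIC), as a lambda expression
// (LE) with explicit parameter types, and as an LE without type annotations.
public class ListenerStyles {

    // Hypothetical single-method listener interface standing in for the
    // listener types used in the studies.
    interface ChangeListener {
        int changed(int oldValue, int newValue);
    }

    public static void main(String[] args) {
        // Variant 1: anonymous inner class (AIC).
        ChangeListener aic = new ChangeListener() {
            @Override
            public int changed(int oldValue, int newValue) {
                return newValue - oldValue;
            }
        };

        // Variant 2: lambda expression with explicit type annotations.
        ChangeListener typedLambda =
                (int oldValue, int newValue) -> newValue - oldValue;

        // Variant 3: lambda expression without type annotations --
        // the variant that required the least reading time in the trials.
        ChangeListener untypedLambda =
                (oldValue, newValue) -> newValue - oldValue;

        System.out.println(aic.changed(2, 5));            // 3
        System.out.println(typedLambda.changed(2, 5));    // 3
        System.out.println(untypedLambda.changed(2, 5));  // 3
    }
}
```

All three variants are behaviorally identical; the studies compare only how quickly a reader can extract information (such as the number of parameters) from each notation.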
Such experiments require relatively low effort in data collection but still permit results to be analyzed in a non-subjective way using commonly accepted analysis techniques. Additionally, they permit a larger number of data points to be collected than in traditional multi-subject experiments. We think that researchers should take such experiments into account before planning and executing larger experiments.
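The core of an N-of-1 trial is that one subject works through a long, randomized sequence of conditions. The following sketch shows one simple way such a schedule can be generated; the condition names and block structure are illustrative assumptions, not the authors' actual design:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Minimal sketch of condition randomization for an N-of-1 trial: the single
// subject sees both techniques equally often, in a randomized order, across
// repeated blocks. Block count and seed are illustrative assumptions.
public class NOf1Schedule {

    enum Technique { AIC, LAMBDA }

    // Build a block-randomized schedule: each block contains every technique
    // exactly once, so conditions stay balanced across the whole session.
    static List<Technique> schedule(int blocks, long seed) {
        Random rng = new Random(seed);
        List<Technique> result = new ArrayList<>();
        for (int b = 0; b < blocks; b++) {
            List<Technique> block = new ArrayList<>(List.of(Technique.values()));
            Collections.shuffle(block, rng);  // randomize order within the block
            result.addAll(block);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Technique> plan = schedule(4, 42L);
        System.out.println(plan);
        // Each technique appears exactly `blocks` times.
        long aics = plan.stream().filter(t -> t == Technique.AIC).count();
        System.out.println("AIC trials: " + aics);  // prints "AIC trials: 4"
    }
}
```

Because the subject's reaction time is recorded per condition occurrence, standard analysis techniques for randomized trials can then be applied to the resulting within-subject data.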
