A Study of Feature Scattering in the Linux Kernel

Feature code is often scattered across a software system. Scattering is not necessarily bad if used with care, as witnessed by systems with highly scattered features that have evolved successfully. Feature scattering, often realized with a pre-processor, circumvents limitations of programming languages and software architectures. Unfortunately, little is known about the principles governing scattering in large and long-lived software systems. We present a longitudinal study of feature scattering in the Linux kernel, complemented by a survey of 74 and interviews with nine Linux kernel developers. We analyzed almost eight years of the kernel's history, focusing on its largest subsystem: device drivers. We learned that the ratio of scattered features remained nearly constant and that most features were introduced without scattering. Yet scattering easily crosses subsystem boundaries, and highly scattered outliers exist. Scattering often addresses a performance-maintenance tradeoff (alleviating complicated APIs), works around hardware design limitations, and avoids code duplication. While developers do not consciously enforce scattering limits, they do improve the system design and refactor code, thereby mitigating pre-processor idiosyncrasies or reducing pre-processor use.
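
To make the notion of pre-processor-based scattering concrete, the following minimal sketch counts, for each feature macro, how many conditional-compilation directives reference it and in how many files. It is not the tooling used in the study. The counting rule (one count per `#if`/`#ifdef`/`#ifndef`/`#elif` line that mentions a `CONFIG_` macro), the function name `scattering`, the file paths, and the macro `CONFIG_FOO_DEBUG` are all assumptions made for illustration.

```python
import re
from collections import Counter

# Conditional-compilation directives; backslash line continuations are ignored for brevity.
DIRECTIVE = re.compile(r"^[ \t]*#[ \t]*(?:if|ifdef|ifndef|elif)\b(.*)$", re.MULTILINE)
# Assumption: feature macros follow the kernel's CONFIG_ naming convention.
FEATURE = re.compile(r"\bCONFIG_[A-Za-z0-9_]+")


def scattering(sources):
    """Map each feature to (number of referencing directives, number of files).

    `sources` is a {path: file_text} dictionary.
    """
    directives = Counter()
    files = {}
    for path, text in sources.items():
        for match in DIRECTIVE.finditer(text):
            # Count a feature once per directive, even if it is named twice in it.
            for feature in set(FEATURE.findall(match.group(1))):
                directives[feature] += 1
                files.setdefault(feature, set()).add(path)
    return {f: (directives[f], len(files[f])) for f in directives}


# Hypothetical two-file example; the paths and code are made up.
EXAMPLE = {
    "drivers/net/foo.c": (
        "#ifdef CONFIG_PM\n"
        "int foo_suspend(void);\n"
        "#endif\n"
        "#if defined(CONFIG_PM) && defined(CONFIG_FOO_DEBUG)\n"
        "#endif\n"
    ),
    "drivers/usb/bar.c": (
        "#ifdef CONFIG_PM\n"
        "int bar_suspend(void);\n"
        "#endif\n"
    ),
}

if __name__ == "__main__":
    for feature, (n_directives, n_files) in sorted(scattering(EXAMPLE).items()):
        print(feature, n_directives, n_files)
```

In this toy input, `CONFIG_PM` is referenced by three directives in two files under different subsystem directories, which is the kind of boundary-crossing scattering the abstract describes, while the hypothetical `CONFIG_FOO_DEBUG` stays local to one file.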
