Methodology for Assessing the State of the Practice for Domain X

To improve software development methods and tools for research software, we first need to understand the current state of the practice. Therefore, we have developed a methodology for assessing the state of software development practice in a given research software domain. The methodology is applied to one domain at a time, in recognition that different domains are likely to have adopted different best practices. Moreover, measuring domains in a consistent way facilitates comparison of development practices between domains. For each domain we wish to answer questions such as: i) What artifacts (documents, code, test cases, etc.) are present? ii) What tools are used? iii) What principles, processes, and methodologies are used? iv) What are the pain points for developers? v) What actions are taken to improve qualities like maintainability and reproducibility? To answer these questions, our methodology prescribes the following steps: i) Identify the domain; ii) Identify a list of candidate software packages; iii) Filter the list to a length of about 30 packages; iv) Gather source code and documentation for each package; v) Collect repository-related data on each software package, such as number of stars, number of open issues, and number of lines of code; vi) Fill in the measurement template, which consists of 108 questions to assess 9 qualities, including installability, usability, and visibility; vii) Interview developers (the interview consists of 20 questions and takes about an hour); viii) Rank the software using the Analytic Hierarchy Process (AHP); and ix) Analyze the data to answer the questions posed above. A domain expert should be engaged throughout the process, to ensure that implicit information about the domain is properly represented and to assist with an analysis of the commonalities and variabilities between the 30 selected packages.
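The ranking in step viii can be illustrated with a minimal sketch of the AHP priority computation. This is not the AHP tool that accompanies the methodology: the package names, the scores, and the rule of using score ratios as pairwise judgements are all assumptions made only for illustration. The sketch approximates the principal eigenvector of a reciprocal pairwise comparison matrix by power iteration.

```python
def pairwise_from_scores(scores):
    """Build a reciprocal comparison matrix from positive per-package scores.

    Assumption (hypothetical, not from the methodology): the judgement
    "package i versus package j" is taken as the ratio of their scores.
    """
    return [[a / b for b in scores] for a in scores]


def ahp_priorities(matrix, iterations=100):
    """Approximate the AHP priority vector of a pairwise comparison matrix.

    Uses power iteration towards the principal eigenvector and normalises
    the result so that the priorities sum to 1.
    """
    n = len(matrix)
    weights = [1.0 / n] * n
    for _ in range(iterations):
        product = [
            sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)
        ]
        total = sum(product)
        weights = [value / total for value in product]
    return weights


# Hypothetical packages and scores for a single quality, e.g. installability.
names = ["package_a", "package_b", "package_c"]
scores = [6.0, 3.0, 1.0]

weights = ahp_priorities(pairwise_from_scores(scores))

# Highest priority first.
ranking = sorted(zip(names, weights), key=lambda pair: pair[1], reverse=True)
```

In a full AHP ranking, a priority vector like this would be computed for each quality and the vectors combined using weights for the qualities themselves; the sketch shows only the single-quality case.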
Using our methodology, spreadsheet templates, and AHP tool, we estimate, based on our experience with the process, that completing an assessment for a given domain takes 173 person-hours.

2012 ACM Subject Classification: Software and its engineering; Software and its engineering → Software product lines; General and reference → Empirical studies
