An empirically-based characterization and quantification of information seeking through mailing lists during Open Source developers' software evolution

Abstract

Context: Several authors have proposed information seeking as an appropriate perspective for studying software evolution. Empirical evidence in this area suggests that substantial time delays can accrue due to the unavailability of required information, particularly when this information must travel across geographically distributed sites.

Objective: As a first step in addressing the time delays that can occur in information seeking for distributed Open Source (OS) programmers during software evolution, this research characterizes the information seeking of OS developers through their mailing lists.

Method: A longitudinal study analyses a total of 17 years of developer mailing list activity across 6 different OS projects, identifying the prevalent information types sought by developers through a qualitative, grounded analysis of this data. Quantitative analysis of the number of responses and the response time-lag is also performed.

Results: The analysis shows that Open Source developers are particularly implementation centric and team focused in their use of mailing lists, mirroring similar findings reported in the literature. However, novel findings include the suggestion that OS developers often require support regarding the technology they use during development, that they refer to documentation fairly frequently, and that they seek implementation-oriented specifics based on system design principles that they anticipate in advance. In addition, response analysis suggests a large variability in the response rates for different types of questions, and particularly that participants have difficulty ascertaining information on other developers' activities.

Conclusion: The findings provide insights for those interested in supporting the information needs of OS developer communities. They suggest that the tools and techniques developed in support of co-located developers should be largely mirrored for these communities: they should be implementation centric and directed at illustrating "how" the system achieves its functional goals and states. Likewise, they should be directed at determining the reason for system bugs, a type of question frequently posed by OS developers but less frequently responded to.
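As a rough illustration of the quantitative step mentioned in the Method (number of responses and response time-lag per question), the following Python sketch computes both measures for a small set of mailing list threads. It is not the authors' tooling: the record layout `(thread_id, timestamp, kind)`, the `"question"`/`"reply"` labels, and the function name `response_stats` are all assumptions made for this example.

```python
from datetime import datetime


def response_stats(messages):
    """Compute per-thread response count and time-lag to the first reply.

    `messages` is a hypothetical list of (thread_id, ISO-8601 timestamp, kind)
    tuples, where kind is "question" or "reply". This layout is an assumption
    for illustration, not the format used in the study.
    """
    # Group posts by thread, parsing timestamps as we go.
    threads = {}
    for thread_id, stamp, kind in messages:
        threads.setdefault(thread_id, []).append((datetime.fromisoformat(stamp), kind))

    stats = {}
    for thread_id, posts in threads.items():
        posts.sort(key=lambda post: post[0])  # chronological order
        # The first question in the thread is treated as the information request.
        asked = next((when for when, kind in posts if kind == "question"), None)
        if asked is None:
            continue  # no question in this thread, nothing to measure
        replies = [when for when, kind in posts if kind == "reply" and when >= asked]
        # Time-lag in hours to the first reply; None when nobody answered.
        lag_hours = (replies[0] - asked).total_seconds() / 3600 if replies else None
        stats[thread_id] = {"responses": len(replies), "first_lag_hours": lag_hours}
    return stats


# Made-up example data: one answered thread and one unanswered thread.
example = [
    ("t1", "2010-01-01T10:00:00", "question"),
    ("t1", "2010-01-01T13:30:00", "reply"),
    ("t1", "2010-01-02T10:00:00", "reply"),
    ("t2", "2010-01-05T09:00:00", "question"),
]
result = response_stats(example)
```

Grouping such per-thread figures by question type would give the kind of comparison the Results describe, where some categories of question receive fewer or slower responses than others.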
