National Boundaries and Semantics of Artefacts in Open Source Development

Global software development has long been recognised as a paradigm shift in modern software development. As an immediate effect, co-location of workers in the same building or office is no longer seen as necessary. Coordination in distributed socio-technical systems is mostly achieved by means of the artefacts produced by the developers who are part of a project's team. Geographic distance profoundly affects the ability to collaborate: as communication becomes less frequent, the challenge is to make it more effective. This is especially complex when different nationalities, languages and cultures are part of the same development effort. Open source software is an example of a distributed, multi-lingual development effort, whose main resulting artefacts are discussions and source code. The semantic corpus of these artefacts may differ depending on whether the authors come from the same ethnic and language groups or from different ones. The purpose of this paper is to evaluate these artefacts in the context of their semantics, and how semantic corpora are affected by development and languages. Using a selection of Open Source projects developed within national boundaries, we compare their semantic richness and how their class content is reflected in their identifiers. We also compare these national projects to a successful international project. The aim is to discover how national boundaries influence the semantics of the developed code.
