Libre (free/open source) software is usually developed by geographically distributed teams. Participation is in most cases voluntary, sometimes sporadic, and often not framed by a predefined management structure. This means that anybody can contribute and that, in principle, no national origin has an advantage over others, apart from differences in language and in the availability and quality of Internet connections. Differences in participation across regions do exist, however, although few studies have examined them. In this paper we present data that can serve as a basis for such studies. We have taken the database of users registered at SourceForge, the largest web-based libre software development platform, and inferred their geographical locations. To do so, we have applied several techniques and heuristics to the available data (mainly e-mail addresses and time zones), which are presented and discussed in detail. The results show a snapshot of the regional distribution of SourceForge users, which may be a good proxy for the actual distribution of libre software developers. In addition, the methodology may be of interest for similar studies in other domains where similar data is available (as is the case for mailing lists related to software projects).
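As a rough illustration of the kind of heuristic described above, the following minimal sketch guesses a location from an e-mail address and a time zone. It is not the procedure used in the paper: the function name `infer_location`, the tiny lookup tables, and the rule of trying the country-code top-level domain first and falling back to the UTC offset are all assumptions made for the example.

```python
# Hypothetical, deliberately tiny lookup tables (assumptions for this sketch).
# A real study would need a complete list of country-code top-level domains.
CCTLD_TO_COUNTRY = {"es": "Spain", "de": "Germany", "fr": "France", "jp": "Japan"}

# A UTC offset is shared by many countries, so it only narrows the guess
# down to a coarse region.
OFFSET_TO_REGION = {
    -5: "Americas (UTC-5)",
    0: "Europe/Africa (UTC+0)",
    1: "Europe/Africa (UTC+1)",
    9: "Asia/Pacific (UTC+9)",
}


def infer_location(email, utc_offset_hours=None):
    """Guess a location: country-code TLD first, then time zone, else 'unknown'."""
    # Take the part after the last "@", then the part after its last ".".
    domain = email.rpartition("@")[2].lower()
    tld = domain.rpartition(".")[2]

    # Generic domains such as .com or .org are not in the table and
    # therefore fall through to the time-zone heuristic.
    if tld in CCTLD_TO_COUNTRY:
        return CCTLD_TO_COUNTRY[tld]
    if utc_offset_hours in OFFSET_TO_REGION:
        return OFFSET_TO_REGION[utc_offset_hours]
    return "unknown"


print(infer_location("dev@example.es", 1))
print(infer_location("dev@example.com", 9))
print(infer_location("dev@example.com"))
```

The three calls exercise the three branches: a country-code domain, a generic domain resolved only to a region by its time zone, and a generic domain with no time zone, which stays unresolved. Addresses of the last kind are the reason a single heuristic is unlikely to be sufficient on its own.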