Studying the evolution of libre software projects using publicly available data

Libre software projects offer abundant information about themselves in publicly available archives (source code snapshots, CVS repositories, etc.), which are a good source of quantitative data about both the project and the software it produces. The retrieval of these data, and part of their analysis, can be automated by following a simple methodology aimed at characterizing the evolution of the project. Since the base information is public, and the tools used are libre and readily available, other groups can easily reproduce and review the results. Since the characterization offers insight into the details of the project, it can be used as the basis for qualitative analysis (including correlations and comparative studies). In some cases, this methodology could also be applied to proprietary software, although usually at the cost of losing the benefits of peer review. As an example, the approach is applied to MONO, a libre software project implementing parts of the .NET framework.