Combining software interrelationship data across heterogeneous software repositories

Software interrelationships affect the quality and evolution of software projects and are therefore important for development and maintenance. Package management and build systems give rise to software ecosystems that are usually syntactically and semantically incompatible with each other, even though the software they describe can overlap. There is currently no general way to query software interrelationships across these ecosystems. In this paper, we present an approach for combining and then querying information about software interrelationships across different ecosystems. We propose an ontology for semantically modeling these relationships as linked data. Furthermore, we introduce a temporal storage and query model to handle inconsistencies between data sources. By providing a scalable and extensible architecture for retrieving and processing data from multiple repositories, we establish a foundation for ongoing research activities. We evaluated our approach by integrating data from several ecosystems and demonstrated its usefulness by building tools for vulnerability notification and license violation detection.
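
To make the idea of a temporal storage and query model more concrete, the following is a minimal sketch under stated assumptions, not the implementation described in this paper. It assumes that each relationship is recorded as a subject-predicate-object statement tagged with the repository that asserted it and with a validity interval, so that conflicting sources can coexist and be queried as of a given time. All names (`Statement`, `TemporalStore`, `dependsOn`, the `pkg:` identifiers, the integer timestamps) are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import List, Optional, Set


@dataclass(frozen=True)
class Statement:
    """One relationship as asserted by one data source (hypothetical model)."""
    subject: str                    # e.g. "pkg:libfoo"
    predicate: str                  # e.g. "dependsOn"
    obj: str                        # e.g. "pkg:openssl"
    source: str                     # repository that asserted the statement
    valid_from: int                 # time at which the statement was first observed
    valid_to: Optional[int] = None  # time at which it was retracted; None = still valid


class TemporalStore:
    """Append-only store: statements are closed, never deleted."""

    def __init__(self) -> None:
        self._statements: List[Statement] = []

    def assert_statement(self, subject: str, predicate: str, obj: str,
                         source: str, time: int) -> None:
        self._statements.append(Statement(subject, predicate, obj, source, time))

    def retract(self, subject: str, predicate: str, obj: str,
                source: str, time: int) -> None:
        # Close the open validity interval of the matching statement(s).
        for i, s in enumerate(self._statements):
            same = (s.subject, s.predicate, s.obj, s.source) == (subject, predicate, obj, source)
            if same and s.valid_to is None:
                self._statements[i] = replace(s, valid_to=time)

    def query(self, subject: str, predicate: str, at: int,
              sources: Optional[Set[str]] = None) -> Set[str]:
        # Return all objects related to `subject` at time `at`,
        # optionally restricted to a set of trusted sources.
        return {
            s.obj
            for s in self._statements
            if s.subject == subject
            and s.predicate == predicate
            and s.valid_from <= at
            and (s.valid_to is None or at < s.valid_to)
            and (sources is None or s.source in sources)
        }


# Two sources disagree about the dependencies of the same package.
store = TemporalStore()
store.assert_statement("pkg:libfoo", "dependsOn", "pkg:openssl", "debian", 1)
store.assert_statement("pkg:libfoo", "dependsOn", "pkg:zlib", "maven", 2)
store.retract("pkg:libfoo", "dependsOn", "pkg:openssl", "debian", 5)

deps_at_3 = store.query("pkg:libfoo", "dependsOn", at=3)
deps_at_6 = store.query("pkg:libfoo", "dependsOn", at=6)
debian_only_at_3 = store.query("pkg:libfoo", "dependsOn", at=3, sources={"debian"})
```

In this sketch, nothing is overwritten when sources disagree: a query either merges all sources at a point in time or restricts itself to selected ones, which is one simple way to tolerate inconsistencies between repositories.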
