Towards a Marketplace of Open Source Software Data

Development, distribution and use of open source software comprise a market of data (source code, bug reports, documentation, number of downloads, etc.) from projects, developers and users. The sheer volume of this data makes it hard for people to make sense of implicit links between software projects, e.g., dependencies, patterns, and licenses. This raises the question of which techniques and mechanisms can help users and developers link related pieces of information across software projects. In this paper, we propose a framework for a marketplace enhanced with linked open data (LOD) technology for linking software artifacts within projects as well as across software projects. The marketplace provides the infrastructure for collecting and aggregating software engineering data as well as for developing services for mining, statistics, analytics and visualization of software data. By cross-linking software artifacts and projects, the marketplace enables developers and users to understand the individual value of components and their relationship to larger software systems. Improved understanding creates new business opportunities for software companies: users will be able to analyze and compare projects, developers can increase the visibility of their products, and hosts may offer plugins and services over the data to paying customers.
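To make the idea of cross-project linking concrete, the following is a minimal sketch, not the framework proposed in the paper. It assumes that facts about projects are stored as subject-predicate-object triples, as in linked data, and shows how dependency links could be followed across projects to collect the licenses involved. The project identifiers, the predicate names `dependsOn` and `license`, and all function names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical identifiers and predicates, for illustration only.
# Each fact is a (subject, predicate, object) triple, as in linked data.
TRIPLES = [
    ("proj:app", "dependsOn", "proj:libparse"),
    ("proj:libparse", "dependsOn", "proj:libcore"),
    ("proj:app", "license", "MIT"),
    ("proj:libparse", "license", "Apache-2.0"),
    ("proj:libcore", "license", "GPL-3.0"),
]


def build_index(triples):
    """Index triples as subject -> predicate -> set of objects."""
    index = defaultdict(lambda: defaultdict(set))
    for subject, predicate, obj in triples:
        index[subject][predicate].add(obj)
    return index


def transitive_dependencies(index, project):
    """Follow dependsOn links across projects and return every project reached."""
    seen = set()
    stack = [project]
    while stack:
        current = stack.pop()
        for dep in index[current]["dependsOn"]:
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


def licenses_in_closure(index, project):
    """Map each direct or indirect dependency to its sorted list of licenses."""
    return {
        dep: sorted(index[dep]["license"])
        for dep in sorted(transitive_dependencies(index, project))
    }


if __name__ == "__main__":
    idx = build_index(TRIPLES)
    for dep, licenses in licenses_in_closure(idx, "proj:app").items():
        print(dep, licenses)
```

In a real deployment the triples would more plausibly live in an RDF store and be queried with SPARQL; the in-memory index here only illustrates how links that are implicit in separate project records become answerable once they share a common graph.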
