CodeOntology: RDF-ization of Source Code

In this paper, we leverage advances in the Semantic Web area, including data modeling (RDF) and data management and querying (Jena and SPARQL), to develop CodeOntology, a community-shared software framework that supports expressive queries over source code. The project makes two main contributions: an ontology that provides a formal representation of object-oriented programming languages, and a parser that analyzes Java source code and serializes it into RDF triples. The parser has been successfully applied to the source code of OpenJDK 8, producing a structured dataset of more than 2 million RDF triples. CodeOntology makes it possible to generate Linked Data from any Java project, so that highly expressive SPARQL queries can be run over its source code.
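To make the idea of serializing code into RDF triples concrete, the following is a minimal sketch, not the project's parser. It assumes a hypothetical namespace (`http://example.org/code#`) and hypothetical terms (`Class`, `Method`, `hasMethod`, `hasReturnType`) rather than the actual CodeOntology vocabulary, and it uses Java reflection on a loaded class instead of analyzing source files. Each output line is one subject-predicate-object statement in N-Triples syntax.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

class TripleSketch {
    // Hypothetical namespace and terms, chosen for illustration only;
    // they are not taken from the CodeOntology ontology.
    static final String NS = "http://example.org/code#";
    static final String RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";

    static String iri(String s) {
        return "<" + s + ">";
    }

    static String triple(String subject, String predicate, String object) {
        return iri(subject) + " " + iri(predicate) + " " + iri(object) + " .";
    }

    // Describes a class and its declared methods as N-Triples lines.
    // Simplifications: overloaded methods share one IRI, and type names
    // are not escaped, so array types would yield invalid IRIs.
    static List<String> triplesFor(Class<?> c) {
        List<String> out = new ArrayList<>();
        String classIri = NS + c.getName();
        out.add(triple(classIri, RDF_TYPE, NS + "Class"));
        for (Method m : c.getDeclaredMethods()) {
            String methodIri = classIri + "." + m.getName();
            out.add(triple(methodIri, RDF_TYPE, NS + "Method"));
            out.add(triple(classIri, NS + "hasMethod", methodIri));
            out.add(triple(methodIri, NS + "hasReturnType",
                    NS + m.getReturnType().getName()));
        }
        return out;
    }

    public static void main(String[] args) {
        // Runnable declares a single method, so the output stays short.
        triplesFor(Runnable.class).forEach(System.out::println);
    }
}
```

Once such triples are loaded into a triple store, a SPARQL query can match patterns over them, for instance selecting every `?m` such that `?c` has method `?m` and `?m` has a given return type. The predicate names in such a query would be those of whichever vocabulary was used to produce the data.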
