SE-CodeSearch: A scalable Semantic Web-based source code search infrastructure

Available code search engines typically provide only coarse-grained lexical search. To address this limitation, we present SE-CodeSearch, a Semantic Web-based approach for Internet-scale source code search. It uses an ontological representation of source code facts and analysis knowledge, and relies on an inference engine to complete missing information. This approach allows us to reason and search across project boundaries, where code fragments are often incomplete and are extracted in a one-pass, no-order manner. The infrastructure provides a scalable way to process and query large code bases mined from software repositories, as well as code fragments found online. We have implemented SE-CodeSearch as part of the SE-Advisor framework to demonstrate the scalability and applicability of our Internet-scale code search in a software maintenance context.
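To make the idea of completing missing facts by inference more concrete, the following is a minimal sketch in Python. It is not the SE-CodeSearch implementation: the predicate names (`subClassOf`, `declaresMethod`, `inheritsMethod`), the two rules, and the helper functions are assumptions chosen purely for illustration, and a plain in-memory set of triples stands in for an ontology store and reasoner. The sketch shows that facts extracted from separate fragments, in any order, lead to the same inferred knowledge.

```python
from itertools import product

# Hypothetical predicate names, chosen only for this illustration.
SUBCLASS_OF = "subClassOf"
DECLARES = "declaresMethod"
INHERITS = "inheritsMethod"


def infer(facts):
    """Forward-chain two illustrative rules until no new triple appears."""
    closed = set(facts)
    while True:
        new = set()
        for (s1, p1, o1), (s2, p2, o2) in product(closed, repeat=2):
            # Rule 1 (assumed): subClassOf is transitive.
            if p1 == SUBCLASS_OF and p2 == SUBCLASS_OF and o1 == s2:
                new.add((s1, SUBCLASS_OF, o2))
            # Rule 2 (assumed): a subclass inherits methods declared by a superclass.
            if p1 == SUBCLASS_OF and p2 == DECLARES and o1 == s2:
                new.add((s1, INHERITS, o2))
        if new <= closed:
            return closed
        closed |= new


def classes_offering(facts, method):
    """Query: which classes declare or inherit the given method?"""
    return sorted(s for (s, p, o) in facts
                  if o == method and p in (DECLARES, INHERITS))


# Facts as they might arrive from three unrelated, incomplete fragments.
fragments = [
    ("C", SUBCLASS_OF, "B"),   # fragment 1 never mentions A
    ("A", DECLARES, "run"),    # fragment 2 never mentions B or C
    ("B", SUBCLASS_OF, "A"),   # fragment 3 links the other two
]

knowledge = infer(fragments)
print(classes_offering(knowledge, "run"))  # ['A', 'B', 'C']
```

No single fragment states that `C` offers `run`; that fact only appears after the rules are applied to the combined triples. Because the result is a fixpoint over a set, the order in which fragments are loaded does not change the outcome, which mirrors the one-pass, no-order property described above.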
