Tracing known security vulnerabilities in software repositories - A Semantic Web enabled modeling approach

The introduction of the Internet has not only revolutionized our society but also transformed the software industry, with knowledge and information sharing becoming a central part of software development processes. The resulting globalization of the software industry has increased software reuse, but it has also introduced new challenges. Among the challenges arising from this knowledge sharing is information security, which has emerged as a major threat to the software development community, since not only source code but also its vulnerabilities are shared across project boundaries. Developers often remain unaware of such security vulnerabilities in their projects until a vulnerability is either exploited by attackers or made publicly available by independent security advisory databases. In this research, we present a modeling approach that takes advantage of Semantic Web technologies to establish traceability links between security advisory repositories and other software repositories. More specifically, we establish a unified ontological representation that supports bi-directional traceability links between knowledge captured in software build repositories and specialized vulnerability databases. These repositories can be considered trusted information silos that are typically not directly linked to other resources, such as source code repositories containing the reported instances of these problems. The novelty of our approach is that it allows us to overcome some of these traditional information silos and transform them into information hubs, which promote sharing of knowledge across repository boundaries. We conducted several experiments to illustrate the applicability of our approach by tracing existing vulnerabilities to projects that might be directly or indirectly affected by vulnerabilities inherited from other projects and libraries.
We introduce two aligned ontologies, SECONT and MAVON, populated with facts extracted from NVD and Maven.
We study how open-source projects are prone to security vulnerabilities.
We study the impact of vulnerabilities on project dependencies.
Security vulnerabilities can stay unsolved through several product releases.
Vulnerable project dependencies increase constantly as the transitive level grows.
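
The idea of tracing a known vulnerability to projects that are directly or indirectly affected can be illustrated with a small sketch. It is not the ontology-based implementation described in this work: a plain breadth-first traversal over a dependency graph stands in for the transitive reasoning, and every artifact name and advisory identifier below is hypothetical.

```python
from collections import deque

# Hypothetical build-repository facts: artifact -> direct dependencies.
DEPENDS_ON = {
    "app:1.0": ["web-lib:2.1"],
    "web-lib:2.1": ["xml-parser:1.4"],
    "xml-parser:1.4": [],
    "cli-tool:0.9": ["xml-parser:1.4"],
    "docs-site:3.0": [],
}

# Hypothetical advisory facts: vulnerable artifact -> advisory identifiers.
ADVISORIES = {"xml-parser:1.4": ["ADVISORY-0001"]}


def affected_projects(depends_on, advisories):
    """Map each affected artifact to (transitive level, vulnerable source).

    Level 0 means the artifact itself is listed in an advisory, level 1 means
    it depends directly on a vulnerable artifact, level 2 means it depends on
    a level-1 artifact, and so on.
    """
    # Invert the dependency edges so we can walk from a vulnerable
    # artifact to everything that uses it.
    used_by = {name: [] for name in depends_on}
    for artifact, deps in depends_on.items():
        for dep in deps:
            used_by.setdefault(dep, []).append(artifact)

    result = {}
    queue = deque((name, 0, name) for name in advisories)
    while queue:
        artifact, level, source = queue.popleft()
        if artifact in result:
            continue  # already reached at the same or a lower level
        result[artifact] = (level, source)
        for dependant in used_by.get(artifact, []):
            queue.append((dependant, level + 1, source))
    return result


if __name__ == "__main__":
    for name, (level, source) in sorted(affected_projects(DEPENDS_ON, ADVISORIES).items()):
        print(f"{name}: level {level}, inherited from {source}")
```

In this toy data, the number of affected artifacts grows with the transitive level considered, while an artifact with no path to a vulnerable dependency is never reported.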
