Refining Traceability Links Between Vulnerability and Software Component in a Vulnerability Knowledge Graph

Information about software vulnerabilities and the software components they affect is usually stored in different locations and in different representations. Building accurate traceability links between the two to form a unified knowledge graph supports vulnerability propagation analysis, component dependency management, and relationship inference. In this paper, we first propose a software vulnerability knowledge graph model that integrates CVE (Common Vulnerabilities and Exposures) information, Java component metadata from the Maven repository, and project collaboration data from GitHub. To construct the knowledge graph, we then propose two ontology matching approaches. The first links Maven projects to GitHub projects by text-matching their URLs. The second uses a random forest algorithm to link CVE project versions to Maven project versions based on 16 well-defined features. Experimental results show that matching between CVE project versions and Maven project versions is highly promising, with an accuracy as high as 99.8%. With our approach, the traceability links between vulnerabilities and software components can be made more accurate.
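
The first matching approach, linking Maven and GitHub projects by URL text, might look roughly like the sketch below. The abstract does not specify the matching rules, so the normalization to a lowercase `owner/repo` key, the function names `github_key` and `link_projects`, and the sample inputs are all assumptions made for illustration, not the authors' implementation.

```python
import re
from typing import Dict, Iterable, Optional

# Hypothetical pattern: matches both https://github.com/owner/repo and
# scm-style git@github.com:owner/repo.git forms.
_GITHUB_RE = re.compile(
    r"github\.com[:/]+([A-Za-z0-9_.-]+)/([A-Za-z0-9_.-]+)", re.IGNORECASE
)


def github_key(url: str) -> Optional[str]:
    """Return a normalized 'owner/repo' key, or None for non-GitHub URLs."""
    match = _GITHUB_RE.search(url or "")
    if match is None:
        return None
    owner, repo = match.group(1), match.group(2)
    if repo.lower().endswith(".git"):
        repo = repo[:-4]  # drop the clone suffix
    return f"{owner}/{repo}".lower()


def link_projects(
    maven_projects: Dict[str, str], github_projects: Iterable[str]
) -> Dict[str, str]:
    """Map Maven coordinates to GitHub URLs that share a normalized key.

    maven_projects maps 'groupId:artifactId' to the URL declared in the POM
    (an assumed input shape).
    """
    index = {}
    for gh_url in github_projects:
        key = github_key(gh_url)
        if key is not None:
            index[key] = gh_url

    links = {}
    for coordinate, declared_url in maven_projects.items():
        key = github_key(declared_url)
        if key in index:
            links[coordinate] = index[key]
    return links
```

Normalizing both sides to the same key before comparing means that differences in scheme, letter case, trailing path segments, or a `.git` suffix do not prevent a link, while projects whose POM declares no GitHub URL are simply left unlinked.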
