ptpDG: A Purchase-To-Pay Dataset Generator for Evaluating Knowledge-Graph-Based Services

This paper introduces ptpDG, a labeled-dataset generator that produces data assets for evaluating knowledge graph construction approaches and downstream knowledge services in the purchase-to-pay domain. While organizations sell, purchase and complain about products in a multi-agent-system simulation, a ground truth knowledge graph emerges that covers different kinds of purchase-to-pay processes. Based on this knowledge graph, heterogeneous electronic purchase-to-pay documents such as e-invoices, credit notes and orders are generated. Noise patterns that we have frequently encountered in real industrial data are then added to those documents. Finally, a provenance graph is generated that records provenance links between document elements and ground truth triples. In this way, ptpDG enables data-driven evaluation, and the publication of such evaluations, in privacy-sensitive scenarios.
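To make the pipeline described above concrete, the following is a minimal sketch of its four stages (simulation, document generation, noise injection, provenance). It is not the ptpDG implementation: the function names `simulate`, `render_documents` and `add_noise`, the `Document` class, the predicate names, and the "dropped field" noise pattern are all hypothetical and chosen only for illustration.

```python
import random
from dataclasses import dataclass

# A ground truth fact as a (subject, predicate, object) triple.
Triple = tuple[str, str, str]


@dataclass
class Document:
    doc_id: str
    doc_type: str
    fields: dict[str, str]


def simulate(rng: random.Random, n_orders: int) -> list[Triple]:
    """Toy stand-in for the multi-agent simulation (assumption):
    each order between two organizations yields three ground truth triples."""
    orgs = ["org:A", "org:B", "org:C"]
    triples: list[Triple] = []
    for i in range(n_orders):
        buyer, seller = rng.sample(orgs, 2)
        order = f"order:{i}"
        triples.append((order, "buyer", buyer))
        triples.append((order, "seller", seller))
        triples.append((order, "amount", f"{rng.randint(10, 500)}.00"))
    return triples


def render_documents(triples: list[Triple]):
    """Group triples by subject into one document each, and record which
    document field was derived from which ground truth triple."""
    docs: dict[str, Document] = {}
    provenance: list[tuple[str, str, Triple]] = []
    for s, p, o in triples:
        doc = docs.setdefault(s, Document(f"doc:{s}", "order", {}))
        doc.fields[p] = o
        provenance.append((doc.doc_id, p, (s, p, o)))
    return list(docs.values()), provenance


def add_noise(doc: Document, rng: random.Random, rate: float) -> Document:
    """Hypothetical noise pattern: drop each field with probability `rate`."""
    kept = {k: v for k, v in doc.fields.items() if rng.random() >= rate}
    return Document(doc.doc_id, doc.doc_type, kept)


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so a generated dataset is reproducible
    truth = simulate(rng, n_orders=2)
    documents, prov = render_documents(truth)
    noisy = [add_noise(d, rng, rate=0.3) for d in documents]
    print(len(truth), len(documents), len(prov))
```

Because the provenance records are created before noise is applied, they still point to the ground truth triple even when the corresponding field has been removed from the noisy document, which is what allows an extraction approach to be scored against the ground truth in this sketch.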