Generating Hard to Comprehend Fake Documents for Defensive Cyber Deception

Existing approaches to cyber defense have been inadequate at defending the targets from advanced persistent threats (APTs). APTs are stealthy and orchestrated attacks, which target both corporations and governments to exfiltrate important data. In this paper, we present a novel comprehensibility manipulation framework (CMF) to generate a haystack of hard to comprehend fake documents, which can be used for deceiving attackers and increasing the cost of data exfiltration by wasting their time and resources. CMF requires an original document as input and generates fake documents that are both believable and readable for the attacker, possess no important information, and are hard to comprehend. To evaluate CMF, we experimented with college aptitude tests and compared the performance of many readers on separate reading comprehension exercises with fake and original content. Our results showed a statistically significant difference in the correct responses to the same questions across the fake and original exercises, thus validating the effectiveness of CMF operations to mislead.