Evaluating Adaptive Deception Strategies for Cyber Defense with Human Adversaries

We investigate the effectiveness of various algorithms for defensive cyber-deception in an adversarial decision-making task using human experiments. Our combinatorial Multi-Armed Bandit task represents an abstract version of a realistic problem in cybersecurity: allocating limited defensive resources so that an adversary is most likely to be deceived into attacking "fake" nodes (i.e., honeypots) instead of the real ones. We propose six algorithms with different degrees of determinism, adaptivity, and customization to the human adversary's actions. We test these algorithms in six separate behavioral studies, in which human participants are paired against each of the six types of defense. We measure the effectiveness of the algorithms by the extent to which humans learn the defense strategies: the less an adversary is able to learn and exploit a strategy, the more successful the deception. We find that the adaptivity of the strategy is more important than the expected optimality of the algorithm. Humans learned and took advantage of defense algorithms that were deterministic, nonadaptive, and not customized. At the same time, not all algorithms that were nondeterministic, adaptive, and customized were effective. The Learning with Linear Rewards (LLR) algorithm, the one that was purely adaptive, was the most successful, suggesting that adaptivity is an important feature of defense algorithms. New ways to customize defense strategies to the adversary's behavior are needed.
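For concreteness, below is a minimal sketch of an LLR-style defender, assuming the combinatorial action is choosing k of n nodes to disguise as honeypots and that the defender observes a per-node reward for each chosen node (semi-bandit feedback). The class and method names (`LLRDefender`, `select`, `update`) and the reward convention are illustrative assumptions, not the authors' implementation.

```python
import math

class LLRDefender:
    """Sketch of Learning with Linear Rewards (LLR) for honeypot allocation:
    each round, pick k of n nodes to disguise as honeypots (assumed setup)."""

    def __init__(self, n_nodes, k):
        self.n = n_nodes
        self.k = k                    # honeypots placed per round (action size L = k)
        self.means = [0.0] * n_nodes  # empirical mean reward per node
        self.counts = [0] * n_nodes   # times each node's reward was observed
        self.t = 0                    # rounds played so far

    def select(self):
        """Return the k node indices with the highest LLR indices."""
        self.t += 1
        # Initialization: play every node at least once before trusting indices.
        unseen = [i for i in range(self.n) if self.counts[i] == 0]
        if unseen:
            forced = unseen[: self.k]
            rest = sorted((i for i in range(self.n) if i not in forced),
                          key=lambda i: self.means[i], reverse=True)
            return forced + rest[: self.k - len(forced)]
        # LLR index: mean + sqrt((L+1) ln t / m_i), with L = k arms per action.
        index = lambda i: self.means[i] + math.sqrt(
            (self.k + 1) * math.log(self.t) / self.counts[i])
        return sorted(range(self.n), key=index, reverse=True)[: self.k]

    def update(self, chosen, rewards):
        """rewards[j]: e.g. 1 if the adversary attacked honeypot chosen[j]
        (assumed reward convention), 0 otherwise."""
        for i, r in zip(chosen, rewards):
            self.counts[i] += 1
            self.means[i] += (r - self.means[i]) / self.counts[i]
```

Because the action set here is all size-k subsets, maximizing the linear sum of per-node indices reduces exactly to taking the k largest indices; a richer combinatorial structure would require a structured argmax in `select`.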