Paper information - The Proposal of Double Agent Architecture using Actor-critic Algorithm for Penetration Testing

Reinforcement learning (RL) is a widely used machine learning method for decision-making that offers advantages over rule-based methods. Because of that advantage, RL has recently been widely applied to penetration testing (PT) problems to assist in planning and deploying cyber attacks. Although networks keep growing in complexity and size, RL is currently applied only to small-scale networks. This paper proposes a double agent architecture (DAA) approach that drastically increases the size of the network that can be solved with RL. This work also examines the effectiveness of popular deep reinforcement learning algorithms, including DQN, DDQN, Dueling DQN, and D3QN, for PT. The A2C algorithm using the Wolpertinger architecture is also adopted as a baseline for comparison. All algorithms are evaluated using a proposed network simulator that is constructed as a Markov decision process (MDP). Our results demonstrate that DAA with the A2C algorithm far outperforms the other approaches in large network environments of up to 1000 hosts.
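As a rough illustration of how two of the value-based algorithms named above differ, the sketch below contrasts the bootstrap target of DQN with that of DDQN (Double DQN). It is not the paper's implementation: the function names, the plain-list representation of Q-values, and the example numbers are all assumptions made for illustration.

```python
from typing import Sequence


def dqn_target(reward: float, gamma: float,
               q_target_next: Sequence[float], done: bool) -> float:
    """DQN target: the target network both selects and evaluates the next action."""
    if done:
        return reward
    return reward + gamma * max(q_target_next)


def double_dqn_target(reward: float, gamma: float,
                      q_online_next: Sequence[float],
                      q_target_next: Sequence[float], done: bool) -> float:
    """Double DQN target: the online network selects the next action,
    and the target network evaluates that chosen action."""
    if done:
        return reward
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]


# Hypothetical Q-value estimates for the next state (three actions).
q_online = [1.0, 3.0, 2.0]   # online network prefers action 1
q_target = [5.0, 0.5, 4.0]   # target network overrates action 0

y_dqn = dqn_target(1.0, 0.9, q_target, done=False)
y_ddqn = double_dqn_target(1.0, 0.9, q_online, q_target, done=False)
print(y_dqn, y_ddqn)
```

Decoupling action selection from action evaluation is what lets Double DQN reduce the overestimation that the plain max operator tends to introduce; in this toy case the DQN target is driven by the target network's largest estimate, while the Double DQN target uses the estimate for the action the online network would actually pick.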

Tetsutaro Uehara | Atsuo Inomata | Songpon Teerakanok | Hoang Nguyen
