Contextual multi-armed bandits for web server defense

In this paper we argue that contextual multi-armed bandit algorithms could open avenues for designing self-learning security modules for computer networks and related tasks. The paper makes two contributions, one conceptual and one algorithmic. The conceptual contribution is to formulate the real-world problem of preventing HTTP-based attacks on web servers as a one-shot sequential learning problem, namely as a contextual multi-armed bandit. The algorithmic contribution is CMABFAS, a new and computationally cheap algorithm for general contextual multi-armed bandit learning that specifically targets domains with finitely many actions. We illustrate how CMABFAS could be used to design a fully self-learning meta filter for web servers that does not rely on feedback from the end user (i.e., does not require labeled data), and we report convincing initial simulation results.
