Evaluating Statistical Models for Network Traffic Anomaly Detection

Large organizations may run hundreds or thousands of applications simultaneously to support their operations. To remain efficient, they need to detect outages and anomalies quickly so that problems can be fixed promptly and costs kept down. This paper describes an analytical framework for anomaly detection in network traffic data, intended to reduce application downtime and the need for human involvement in detecting or reporting anomalous application behavior. We use this framework to compare the anomaly-detection performance of a Seasonal Autoregressive Integrated Moving Average (SARIMA) time series model and a Long Short-Term Memory (LSTM) autoencoder model. We evaluated the models on false positive rate and accuracy, subject to the requirement that alerts be timely, and found that although both models were accurate, their false positive rates were very high. We then improved overall detection performance by ensembling the SARIMA and LSTM autoencoder models. Our results demonstrate a possible new method for anomaly detection in network traffic flow using time series models and autoencoders.
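
To make the two evaluation metrics concrete, the following minimal Python sketch computes the false positive rate and accuracy for two sets of per-timestep anomaly flags and for a simple ensemble of them. The ensembling rule shown here (flag a point only when both detectors agree), the function names, and the toy labels are assumptions chosen for illustration; they are not taken from the paper and are not its results.

```python
def false_positive_rate(y_true, y_pred):
    """FP / (FP + TN): share of truly normal points that were flagged."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    return fp / (fp + tn) if (fp + tn) else 0.0


def accuracy(y_true, y_pred):
    """Share of points whose flag matches the ground-truth label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if bool(t) == bool(p))
    return correct / len(y_true)


def ensemble_and(flags_a, flags_b):
    """Hypothetical ensembling rule: flag only when both detectors agree."""
    return [bool(a) and bool(b) for a, b in zip(flags_a, flags_b)]


# Toy labels for illustration only (1 = anomaly, 0 = normal).
truth        = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
sarima_flags = [1, 0, 1, 0, 0, 0, 0, 0, 1, 1]  # stand-in for SARIMA output
lstm_flags   = [0, 1, 1, 0, 0, 0, 0, 0, 1, 1]  # stand-in for LSTM autoencoder output

combined = ensemble_and(sarima_flags, lstm_flags)

for name, flags in [("sarima", sarima_flags),
                    ("lstm", lstm_flags),
                    ("ensemble", combined)]:
    print(name, false_positive_rate(truth, flags), accuracy(truth, flags))
```

In this toy case each detector alone raises two false alarms on eight normal points, while the agreement rule keeps only the one alarm they share. This shows how an ensemble can lower the false positive rate without lowering accuracy, provided the individual detectors' false alarms do not coincide.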