Identifying Anomalies in GridFTP transfers for Data-Intensive Science through Application-Awareness

Network anomaly detection systems can be used to identify anomalous transfers or threats, which, when undetected, can trigger large-scale malicious events. Data-intensive science projects rely on high-throughput computing and high-speed networking resources for data analysis and processing. In this paper, we propose an anomaly detection framework and architecture for identifying anomalies in GridFTP transfers. Application-awareness plays an important role in our proposed architecture and is used to communicate GridFTP application metadata to the machine learning and anomaly detection system. We demonstrate the effectiveness of our architecture by evaluating the framework with a real-world, large-scale dataset of GridFTP transfers. Preliminary results show that our framework can be used to develop novel anomaly detection services with diverse feature sets for distributed and data-intensive projects.

[1]  Vinod Yegneswaran,et al.  Athena: A Framework for Scalable Anomaly Detection in Software-Defined Networks , 2017, 2017 47th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN).

[2]  Ramesh Govindan,et al.  SCREAM: sketch resource allocation for software-defined measurement , 2015, CoNEXT.

[3]  Steven Tuecke,et al.  GridFTP: Protocol Extensions to FTP for the Grid , 2001 .

[4]  William E. Allcock,et al.  The globus extensible input/output system (XIO): a protocol independent IO system for the grid , 2005, 19th IEEE International Parallel and Distributed Processing Symposium.

[5]  Minlan Yu,et al.  Software Defined Traffic Measurement with OpenSketch , 2013, NSDI.

[6]  Joshua R. Smith,et al.  LIGO: The laser interferometer gravitational-wave observatory , 2006, QELS 2006.

[7]  Pavlin Radoslavov,et al.  ONOS: towards an open, distributed SDN OS , 2014, HotSDN.

[8]  William E. Allcock,et al.  The Globus Striped GridFTP Framework and Server , 2005, ACM/IEEE SC 2005 Conference (SC'05).

[9]  Basil S. Maglaris,et al.  Combining OpenFlow and sFlow for an effective and scalable anomaly detection and mitigation mechanism on SDN environments , 2014, Comput. Networks.

[10]  Basil S. Maglaris,et al.  Leveraging SDN for Efficient Anomaly Detection and Mitigation on Legacy Networks , 2014, 2014 Third European Workshop on Software Defined Networks.

[11]  Ian T. Foster,et al.  Globus Toolkit Version 4: Software for Service-Oriented Systems , 2005, Journal of Computer Science and Technology.

[12]  Jorge Luis Rodriguez,et al.  The Open Science Grid , 2005 .

[13]  João Paulo Teixeira,et al.  The CMS experiment at the CERN LHC , 2008 .

[14]  Ramesh Govindan,et al.  DREAM , 2014, SIGCOMM.

[15]  Byrav Ramamurthy,et al.  SNAG : SDN-managed Network Architecture for GridFTP Transfers I , 2016 .