Expert-Based Fusion Algorithm of an Ensemble of Anomaly Detection Algorithms

Data fusion systems are widely used in areas such as sensor networks, robotics, video and image processing, and intelligent system design. Data fusion combines information from several sources to form a unified picture or decision. Anomaly detection algorithms (ADAs) are now used in a wide variety of applications, such as cyber security systems. In this research we focus on integrating the outputs of multiple ADAs that operate within a particular domain. Specifically, we propose a two-stage fusion process: the first stage derives the expertise of each individual ADA, and the second stage fuses the ADA outputs based on that expertise. The main idea of the proposed method is to identify multiple types of outliers and to find a set of expert outlier detection algorithms for each type. We propose to use semi-supervised methods. We report preliminary experiments for the single-type outlier case, in which our method outperforms other benchmark methods from the literature.
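
The two-stage idea can be illustrated with a minimal sketch for the single-type outlier case. The abstract does not specify how expertise is measured or how scores are combined, so everything concrete here is an assumption made for illustration: expertise is taken to be how far a detector's AUC on a small labeled subset exceeds chance, scores are min-max normalized, and fusion is an expertise-weighted average. The function names (`estimate_expertise`, `fuse`) and the detector names are hypothetical.

```python
from typing import Dict, List, Sequence


def min_max(scores: Sequence[float]) -> List[float]:
    """Rescale scores to [0, 1] so detectors with different ranges are comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Fraction of (outlier, inlier) pairs ranked correctly; ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        return 0.5  # no evidence either way
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def estimate_expertise(
    detector_scores: Dict[str, List[float]], labeled: Dict[int, int]
) -> Dict[str, float]:
    """Stage 1 (assumed form): expertise = AUC above chance on the labeled subset.

    `labeled` maps a sample index to 1 (outlier) or 0 (inlier).
    """
    idx = sorted(labeled)
    labels = [labeled[i] for i in idx]
    return {
        name: max(auc([scores[i] for i in idx], labels) - 0.5, 0.0)
        for name, scores in detector_scores.items()
    }


def fuse(
    detector_scores: Dict[str, List[float]], expertise: Dict[str, float]
) -> List[float]:
    """Stage 2 (assumed form): expertise-weighted average of normalized scores."""
    names = list(detector_scores)
    n = len(detector_scores[names[0]])
    total = sum(expertise[name] for name in names)
    if total == 0:
        # No detector beats chance: fall back to a plain average.
        weights = {name: 1.0 / len(names) for name in names}
    else:
        weights = {name: expertise[name] / total for name in names}
    normalized = {name: min_max(detector_scores[name]) for name in names}
    return [sum(weights[name] * normalized[name][i] for name in names) for i in range(n)]


if __name__ == "__main__":
    # Hypothetical outlier scores from two detectors on five samples.
    scores = {
        "ada_a": [0.1, 0.2, 0.9, 0.15, 0.8],
        "ada_b": [0.5, 0.4, 0.3, 0.6, 0.2],
    }
    labeled = {0: 0, 2: 1, 3: 0}  # only three samples carry labels
    expertise = estimate_expertise(scores, labeled)
    print(expertise)
    print(fuse(scores, expertise))
```

In this toy input the second detector ranks the labeled outlier below both labeled inliers, so it receives zero weight and the fused score follows the first detector. Extending the sketch to multiple outlier types would require one expertise estimate per type, which the abstract describes only at a high level.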
