Time-Aligned Similarity Calculations for Job-Centric Monitoring

In job-centric monitoring, monitors gather series of measurements, e.g., the CPU load, per job. In domains where jobs are expected to behave similarly, job-centric monitoring allows misbehaving jobs to be detected by comparison with a reference series of measurements. However, current detection approaches neglect time drifts in series, e.g., those caused by different CPU speeds, and can therefore produce false positives. To cope with this issue, this paper introduces a novel approach to compensate for such time drifts. Our approach is based on a transformation that aligns a series of measurements to the time axis of the reference series. In a proof of concept with synthetic job-centric monitoring data, we show that our approach significantly reduces the number of false positives.
