Using root cause analysis to handle intrusion detection alarms

Klaus Julisch
IBM Zurich Research Laboratory
Säumerstrasse 4, 8803 Rüschlikon, Switzerland
e-mail: kju@zurich.ibm.com

In response to attacks against enterprise networks, administrators are increasingly deploying intrusion detection systems. These systems monitor hosts, networks, and other resources for signs of security violations. Unfortunately, the use of intrusion detection has given rise to another difficult problem, namely the handling of a generally large number of mostly false alarms.

This dissertation presents a novel paradigm for handling intrusion detection alarms more efficiently. Central to this paradigm is the notion that each alarm occurs for a reason, which is referred to as the alarm's root cause. This dissertation observes that a few dozen root causes generally account for over 90% of the alarms in an alarm log. Moreover, these root causes are generally persistent, i.e. they keep triggering alarms until someone removes them. Based on these observations, we propose a new two-step paradigm for alarm handling: Step one identifies root causes that account for large numbers of alarms, and step two removes these root causes and thereby reduces the future alarm load. Alternatively, alarms originating from benign root causes can be filtered out.

To support the discovery of root causes, we propose a novel data mining technique, called alarm clustering. To lay the foundation for alarm clustering, we show that many root causes manifest themselves in alarm groups that have certain structural properties. We formalize these structural properties and propose alarm clustering as a method for extracting alarm groups that have these properties. Such alarm groups are generally indicative of root causes. We therefore present them to a human expert who is responsible for identifying the underlying root causes.
Once identified, the root causes can be removed (or false positives can be filtered out) so as to reduce the
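
The observation that a small number of root causes accounts for most alarms can be illustrated with a minimal sketch. This is not the alarm clustering method of the dissertation: it only groups alarms by exact attribute match and reports the most frequent groups until a coverage threshold is reached. The field names `alarm_type` and `source`, the function name `top_alarm_groups`, and the sample alarms are all hypothetical, chosen for illustration only.

```python
from collections import Counter


def top_alarm_groups(alarms, keys=("alarm_type", "source"), coverage=0.9):
    """Return the most frequent alarm groups, largest first, until the
    selected groups together cover at least `coverage` of all alarms."""
    # Group alarms by the exact values of the chosen attributes (an assumption;
    # the dissertation's clustering is more general than exact matching).
    counts = Counter(tuple(alarm[k] for k in keys) for alarm in alarms)
    total = sum(counts.values())
    selected, covered = [], 0
    for group, n in counts.most_common():
        if covered / total >= coverage:
            break
        selected.append((group, n))
        covered += n
    return selected


# Hypothetical alarm log: two groups dominate the alarm volume.
alarms = (
    [{"alarm_type": "SYN flood", "source": "10.0.0.5"}] * 60
    + [{"alarm_type": "Fragmented IP", "source": "10.0.0.9"}] * 35
    + [{"alarm_type": "Port scan", "source": "192.0.2.7"}] * 5
)
groups = top_alarm_groups(alarms)
for group, n in groups:
    print(group, n)
```

In this toy log, the groups returned would be the candidates handed to a human expert, who decides whether the underlying cause should be removed or its alarms filtered.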
