Real-Time Knowledge Discovery and Dissemination for Intelligence Analysis

This paper describes the issues and challenges for real-time knowledge discovery and then discusses approaches and challenges for real-time data mining and stream mining. Our goal is to extract accurate information to support the emergency responder, the war fighter, as well as the intelligence analyst in a timely manner.

[1]  David Heckerman,et al.  Bayesian Networks for Knowledge Discovery , 1996, Advances in Knowledge Discovery and Data Mining.

[2]  Anthony Widjaja,et al.  Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond , 2003, IEEE Transactions on Neural Networks.

[3]  Geoff Hulten,et al.  Mining time-changing data streams , 2001, KDD '01.

[4]  Philip S. Yu,et al.  A framework for on-demand classification of evolving data streams , 2006, IEEE Transactions on Knowledge and Data Engineering.

[5]  Jiawei Han,et al.  High-Dimensional OLAP: A Minimal Cubing Approach , 2004, VLDB.

[6]  S. Muthukrishnan,et al.  Surfing Wavelets on Streams: One-Pass Summaries for Approximate Aggregate Queries , 2001, VLDB.

[7]  Raymond T. Ng,et al.  Algorithms for Mining Distance-Based Outliers in Large Datasets , 1998, VLDB.

[8]  Mohammed J. Zaki,et al.  CHARM: An Efficient Algorithm for Closed Itemset Mining , 2002, SDM.

[9]  Thomas G. Dietterich What is machine learning? , 2020, Archives of Disease in Childhood.

[10]  Anil K. Jain,et al.  Algorithms for Clustering Data , 1988 .

[11]  Hongyan Liu,et al.  Error-Adaptive and Time-Aware Maintenance of Frequency Counts over Data Streams , 2006, WAIM.

[12]  Jiawei Han,et al.  Classifying large data sets using SVMs with hierarchical clusters , 2003, KDD '03.

[13]  Anthony K. H. Tung,et al.  Ranking Outliers Using Symmetric Neighborhood Relationship , 2006, PAKDD.

[14]  Hans-Peter Kriegel,et al.  LOF: identifying density-based local outliers , 2000, SIGMOD '00.

[15]  Shashi Shekhar,et al.  Detecting graph-based spatial outliers: algorithms and applications (a summary of results) , 2001, KDD '01.

[16]  George Karypis,et al.  C HAMELEON : A Hierarchical Clustering Algorithm Using Dynamic Modeling , 1999 .

[17]  Laks V. S. Lakshmanan,et al.  Exploratory mining and pruning optimizations of constrained associations rules , 1998, SIGMOD '98.

[18]  Sudipto Guha,et al.  Streaming-data algorithms for high-quality clustering , 2002, Proceedings 18th International Conference on Data Engineering.

[19]  Philip S. Yu,et al.  Mining concept-drifting data streams using ensemble classifiers , 2003, KDD '03.

[20]  Jiawei Han,et al.  MAIDS: mining alarming incidents from data streams , 2004, SIGMOD '04.

[21]  Sudipto Guha,et al.  Clustering Data Streams , 2000, FOCS.

[22]  Yixin Chen,et al.  Stream Cube: An Architecture for Multi-Dimensional Analysis of Data Streams , 2005, Distributed and Parallel Databases.

[23]  David G. Stork,et al.  Pattern Classification (2nd ed.) , 1999 .

[24]  Hongyan Liu,et al.  C-Cubing: Efficient Computation of Closed Cubes by Aggregation-Based Checking , 2006, 22nd International Conference on Data Engineering (ICDE'06).

[25]  D. Ruppert The Elements of Statistical Learning: Data Mining, Inference, and Prediction , 2004 .

[26]  Yoav Freund,et al.  A decision-theoretic generalization of on-line learning and an application to boosting , 1995, EuroCOLT.

[27]  Jiawei Han,et al.  Star-Cubing: Computing Iceberg Cubes by Top-Down and Bottom-Up Integration , 2003, Very Large Data Bases Conference.

[28]  Philip S. Yu,et al.  Fast algorithms for projected clustering , 1999, SIGMOD '99.

[29]  RamakrishnanRaghu,et al.  Bottom-up computation of sparse and Iceberg CUBE , 1999 .

[30]  Pedro M. Domingos,et al.  On the Optimality of the Simple Bayesian Classifier under Zero-One Loss , 1997, Machine Learning.

[31]  Hans-Peter Kriegel,et al.  OPTICS: ordering points to identify the clustering structure , 1999, SIGMOD '99.

[32]  Mohammed J. Zaki,et al.  SPADE: An Efficient Algorithm for Mining Frequent Sequences , 2004, Machine Learning.

[33]  Ujjwal Maulik,et al.  Advanced Methods for Knowledge Discovery from Complex Data , 2005 .

[34]  Philip S. Yu,et al.  A Framework for Clustering Evolving Data Streams , 2003, VLDB.

[35]  Kamesh Munagala,et al.  Local Search Heuristics for k-Median and Facility Location Problems , 2004, SIAM J. Comput..

[36]  Philip S. Yu,et al.  On demand classification of data streams , 2004, KDD.

[37]  Jiawei Han,et al.  CloseGraph: mining closed frequent graph patterns , 2003, KDD '03.

[38]  Rakesh Agarwal,et al.  Fast Algorithms for Mining Association Rules , 1994, VLDB 1994.

[39]  J. Ross Quinlan,et al.  Induction of Decision Trees , 1986, Machine Learning.

[40]  Jorma Rissanen,et al.  An MDL Framework for Data Clustering , 2005 .

[41]  Petra Perner,et al.  Data Mining - Concepts and Techniques , 2002, Künstliche Intell..

[42]  Yixin Chen,et al.  Multi-Dimensional Regression Analysis of Time-Series Data Streams , 2002, VLDB.

[43]  Ramakrishnan Srikant,et al.  Mining Sequential Patterns: Generalizations and Performance Improvements , 1996, EDBT.

[44]  Dr. Bhavani Thuraisingham ASSURED INFORMATION SHARING VOLUME 1 : OVERVIEW , 2006 .

[45]  Jennifer Widom,et al.  Models and issues in data stream systems , 2002, PODS.

[46]  James Kelly,et al.  AutoClass: A Bayesian Classification System , 1993, ML.

[47]  Jennifer Widom,et al.  Continuous queries over data streams , 2001, SGMD.

[48]  Tian Zhang,et al.  BIRCH: an efficient data clustering method for very large databases , 1996, SIGMOD '96.

[49]  Shigeo Abe DrEng Pattern Classification , 2001, Springer London.

[50]  Dimitrios Gunopulos,et al.  Automatic subspace clustering of high dimensional data for data mining applications , 1998, SIGMOD '98.

[51]  Jian Pei,et al.  Efficient computation of Iceberg cubes with complex measures , 2001, SIGMOD '01.

[52]  Jian Pei,et al.  Mining frequent patterns without candidate generation , 2000, SIGMOD '00.

[53]  JOHANNES GEHRKE,et al.  RainForest—A Framework for Fast Decision Tree Construction of Large Datasets , 1998, Data Mining and Knowledge Discovery.

[54]  Geoff Hulten,et al.  Mining high-speed data streams , 2000, KDD '00.

[55]  Wei-Yin Loh,et al.  A Comparison of Prediction Accuracy, Complexity, and Training Time of Thirty-Three Old and New Classification Algorithms , 2000, Machine Learning.

[56]  Wolfgang Maass,et al.  Algorithms and Lower Bounds for On-Line Learning of Geometrical Concepts , 1994, Machine Learning.

[57]  Leonid Khachiyan,et al.  On the Complexity of Dualization of Monotone Disjunctive Normal Forms , 1996, J. Algorithms.

[58]  Jiawei Han,et al.  Discovering an Evolutionary Classifier over a High-speed Nonstatic Stream , 2005 .

[59]  Paul S. Bradley,et al.  Scaling Clustering Algorithms to Large Databases , 1998, KDD.

[60]  Jiawei Han,et al.  IncSpan: incremental mining of sequential patterns in large database , 2004, KDD.

[61]  Jeffrey F. Naughton,et al.  An array-based algorithm for simultaneous multidimensional aggregates , 1997, SIGMOD '97.

[62]  Philip S. Yu,et al.  A Framework for Projected Clustering of High Dimensional Data Streams , 2004, VLDB.

[63]  Peter J. Rousseeuw,et al.  Finding Groups in Data: An Introduction to Cluster Analysis , 1990 .

[64]  Sudipto Guha,et al.  Data-streams and histograms , 2001, STOC '01.

[65]  Jianyong Wang,et al.  Mining sequential patterns by pattern-growth: the PrefixSpan approach , 2004, IEEE Transactions on Knowledge and Data Engineering.

[66]  Daniel A. Keim,et al.  An Efficient Approach to Clustering in Large Multimedia Databases with Noise , 1998, KDD.

[67]  R. Hämäläinen,et al.  Hawaii International Conference on System Sciences , 2004 .

[68]  Sudipto Guha,et al.  CURE: an efficient clustering algorithm for large databases , 1998, SIGMOD '98.

[69]  Rajeev Rastogi,et al.  Processing complex aggregate queries over data streams , 2002, SIGMOD '02.