Many organizations today have databases that are not only very large but also grow without limit, at a rate of several million records per day. Data streams are ubiquitous and have become an important research topic over the last two decades. Mining these continuous data streams brings unique opportunities, but also new challenges. For predictive nonparametric analysis of such streams, Hoeffding-based trees are often the method of choice because they can deliver predictions at any time. However, one of their main problems is the delay in learning progress caused by the presence of equally discriminative attributes. Options are a natural way to deal with this problem. This paper presents option trees, which build upon regular trees by adding splitting options in the internal nodes to improve accuracy and stability and to reduce ambiguity. The adaptive Hoeffding option tree algorithm is reviewed, and results on its accuracy and processing speed under various memory limits are presented. The accuracy of the Hoeffding option tree is compared with that of Hoeffding trees and the adaptive Hoeffding option tree under circumstantial conditions.
Keywords: data stream, Hoeffding trees, option trees, adaptive Hoeffding option trees, large databases
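The delay caused by equally discriminative attributes can be illustrated with the standard Hoeffding bound, epsilon = sqrt(R^2 ln(1/delta) / (2n)), which Hoeffding trees use to decide when enough examples have been seen to split a node. The sketch below is a minimal illustration, not the algorithm evaluated in this paper: the function names, the gain values, and the choices of R, delta and n are assumptions made for the example.

```python
import math


def hoeffding_bound(value_range, delta, n):
    """Hoeffding bound epsilon for a quantity with range `value_range`,
    confidence 1 - delta, after n observations."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def can_split(gain_best, gain_second, value_range, delta, n):
    """Split only when the best attribute beats the runner-up by more
    than epsilon (illustrative helper, hypothetical name)."""
    return (gain_best - gain_second) > hoeffding_bound(value_range, delta, n)


if __name__ == "__main__":
    # Illustrative values only: R = 1 (two-class information gain), delta = 1e-7.
    for n in (1_000, 100_000, 1_000_000):
        eps = hoeffding_bound(1.0, 1e-7, n)
        clear = can_split(0.30, 0.10, 1.0, 1e-7, n)
        tied = can_split(0.30, 0.30, 1.0, 1e-7, n)
        print(f"n={n}: epsilon={eps:.4f}, clear winner splits={clear}, tie splits={tied}")
```

When two attributes have the same observed gain, their difference never exceeds epsilon however many examples arrive, so the node keeps waiting. Keeping several candidate splits as options at a node is one way to avoid committing to a single attribute in that situation.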