Bitmap Index-Based Decision Trees

In this paper we propose an original approach to applying data mining algorithms, namely decision tree-based methods, that takes into account not only the size of the processed databases but also the processing time. The key idea is to construct the decision tree within the DBMS, using bitmap indices. Bitmap indices have several useful properties, such as support for count and bit-wise operations, and we show that these operations can be used to build decision trees efficiently. In addition, by using bitmap indices, we do not need to access the raw data, which yields clear improvements in processing time.
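The idea of building a tree from counts and bit-wise operations alone can be illustrated with a minimal sketch. It is not the paper's in-DBMS implementation: it assumes bitmaps stored as Python integers, an entropy-based (ID3-style) split criterion, and a made-up toy table. The helper names (`build_bitmap_index`, `popcount`, `information_gain`) are hypothetical. Once the bitmaps are built, every class count needed to score a split comes from an AND followed by a bit count, so the raw rows are never read again.

```python
from math import log2


def build_bitmap_index(column):
    """Map each distinct value to an int whose bit i is set when row i holds it."""
    index = {}
    for row, value in enumerate(column):
        index[value] = index.get(value, 0) | (1 << row)
    return index


def popcount(bitmap):
    """Count operation: number of rows (set bits) in a bitmap."""
    return bin(bitmap).count("1")


def entropy(counts):
    """Shannon entropy of a class distribution given as raw counts."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c)


def information_gain(node_bitmap, attr_index, class_index):
    """Score a split on one attribute using only AND and count on bitmaps."""
    node_size = popcount(node_bitmap)
    parent = entropy([popcount(node_bitmap & b) for b in class_index.values()])
    children = 0.0
    for value_bitmap in attr_index.values():
        subset = node_bitmap & value_bitmap  # rows of this node with this value
        size = popcount(subset)
        if size:
            counts = [popcount(subset & b) for b in class_index.values()]
            children += size / node_size * entropy(counts)
    return parent - children


# Toy table (assumed data, for illustration only).
outlook = ["sunny", "sunny", "overcast", "rain", "rain", "overcast"]
windy = ["no", "yes", "no", "no", "yes", "yes"]
play = ["no", "no", "yes", "yes", "no", "yes"]

indices = {
    "outlook": build_bitmap_index(outlook),
    "windy": build_bitmap_index(windy),
}
class_index = build_bitmap_index(play)

root = (1 << len(play)) - 1  # bitmap with every row set: the root node
gains = {name: information_gain(root, idx, class_index)
         for name, idx in indices.items()}
best = max(gains, key=gains.get)  # attribute chosen for the root split
```

In this sketch a child node is simply the parent's bitmap ANDed with one attribute-value bitmap, so the same `information_gain` call can be repeated recursively to grow the tree. On the toy table, `outlook` is selected at the root with a gain of about 0.667.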
