The Dimension-Join: A New Index for Data Warehouses

There are several auxiliary pre-computed access structures that allow faster answers by reading less base data. Examples are materialized views, join indexes, B-tree and bitmap indexes. This paper proposes dimension-join, a new type of index especially suited for data warehouses. The dimension-join borrows ideas from several concepts. It is a bitmap index, it is a multi-table join and when being used one of the tables is not read to improve performance. It is a multi-table join because it holds information belonging to two tables, which is similar to the join index proposed by Valduriez. However, instead of being composed by the tables’ primary keys, the dimension-join index is a bitmap index over the fact table using values from a dimension column. The dimension-join index is very useful when selecting facts depending on dimension tables belonging to snowflakes. The dimension-join represents a direct connection between the fact table and a table in the snowflake that can avoid several joins and produce enormous performance improvements. This paper also evaluates experimentally the dimension-join indexes using the TPC-H benchmark and shows that this new index structure can dramatically improve the performance for some queries.

[1]  Sven Helmer,et al.  Diag-Join: An Opportunistic Join Algorithm for 1:N Relationships , 1998, VLDB.

[2]  Kenneth A. Ross,et al.  Fast joins using join indices , 1999, The VLDB Journal.

[3]  Goetz Graefe,et al.  Multi-table joins through bitmapped join indices , 1995, SGMD.

[4]  Hans-Joachim Lenz,et al.  Tree Based Indexes vs. Bitmap Indexes - a Performance Study , 1999, DMDW.

[5]  Nick Roussopoulos,et al.  Materialized views and data warehouses , 1998, SGMD.

[6]  Raghu Ramakrishnan,et al.  Database Management Systems , 1976 .

[7]  Ralph Kimball,et al.  The Data Warehouse Toolkit: Practical Techniques for Building Dimensional Data Warehouses , 1996 .

[8]  Yannis E. Ioannidis,et al.  Bitmap index design and evaluation , 1998, SIGMOD '98.

[9]  Margaret H. Dunham,et al.  Join processing in relational databases , 1992, CSUR.

[10]  Laura M. Haas,et al.  Seeking the truth about ad hoc join costs , 1997, The VLDB Journal.

[11]  Patrick Valduriez,et al.  Join indices , 1987, TODS.

[12]  Nancy L. Martin,et al.  Join index, materialized view, and hybrid-hash join: a performance analysis , 1990, [1990] Proceedings. Sixth International Conference on Data Engineering.

[13]  Bonnie O'Neil,et al.  Oracle Data Warehousing , 1997 .

[14]  Kjell Bratbergsengen,et al.  Hashing Methods and Relational Algebra Operations , 1984, VLDB.

[15]  R. Bayer,et al.  Organization and maintenance of large ordered indices , 1970, SIGFIDET '70.

[16]  Hidehiko Tanaka,et al.  Relational Algebra Machine GRACE , 1982, RIMS Symposium on Software Science and Engineering.