Effective Bitmap Indexing for Non-metric Similarities

A growing number of applications include recommender systems that must search in a non-metric similarity space, which creates demand for indexing techniques that are both efficient and flexible enough to support similarity search. The growing volume of data available to recommender systems adds to this demand. This paper addresses the problem in the specific domain of music recommendation. It presents the Music On Demand framework, in which music retrieval is performed in a continuous, stream-based fashion. Similarity measures between songs are computed on high-dimensional feature spaces and often do not obey the triangle inequality, so existing indexing techniques for high-dimensional data are not applicable. The main contribution of the paper is an indexing approach that is effective for non-metric similarities. It achieves this by combining a number of bitmap indexes with effective bitmap compression techniques. Experiments show that the approach scales well.
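The abstract does not describe the index layout, so the following Python sketch is only an illustration of the general idea, not the paper's method. It assumes one bitmap per seed song and per similarity threshold, stored as a Python integer, and it uses a naive run-length encoding as a stand-in for whatever compression scheme the paper actually uses. The names `SIM`, `build_bitmaps`, `candidates`, `to_ids` and `run_lengths` are all hypothetical. The toy similarity matrix is deliberately non-metric: songs 0 and 1 are similar, songs 1 and 2 are similar, yet songs 0 and 2 are not.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Toy similarity matrix (assumption, not data from the paper).
# With distance d = 1 - s, d(0,2) = 0.9 > d(0,1) + d(1,2) = 0.2,
# so the triangle inequality does not hold.
SIM = [
    [1.0, 0.9, 0.1, 0.8],
    [0.9, 1.0, 0.9, 0.2],
    [0.1, 0.9, 1.0, 0.7],
    [0.8, 0.2, 0.7, 1.0],
]


def build_bitmaps(
    n_songs: int,
    sim: Callable[[int, int], float],
    thresholds: Sequence[float],
) -> Dict[int, List[int]]:
    """For each seed song and threshold, build a bitmap whose bit j is set
    when sim(seed, j) >= threshold. No metric property is required."""
    index: Dict[int, List[int]] = {}
    for seed in range(n_songs):
        per_threshold = []
        for t in thresholds:
            bits = 0
            for other in range(n_songs):
                if other != seed and sim(seed, other) >= t:
                    bits |= 1 << other
            per_threshold.append(bits)
        index[seed] = per_threshold
    return index


def candidates(index: Dict[int, List[int]], seed: int, level: int,
               excluded: int = 0) -> int:
    """Songs similar to `seed` at the given threshold level, minus an
    exclusion bitmap (for example, songs already played or skipped)."""
    return index[seed][level] & ~excluded


def to_ids(bits: int) -> List[int]:
    """Expand a bitmap into the sorted list of set bit positions."""
    ids, pos = [], 0
    while bits:
        if bits & 1:
            ids.append(pos)
        bits >>= 1
        pos += 1
    return ids


def run_lengths(bits: int, n: int) -> List[Tuple[int, int]]:
    """Naive run-length encoding of the n low bits, lowest bit first.
    A stand-in for a real compressed-bitmap format."""
    runs: List[Tuple[int, int]] = []
    for pos in range(n):
        b = (bits >> pos) & 1
        if runs and runs[-1][0] == b:
            runs[-1] = (b, runs[-1][1] + 1)
        else:
            runs.append((b, 1))
    return runs


index = build_bitmaps(4, lambda a, b: SIM[a][b], thresholds=[0.75, 0.5])
similar_to_0 = to_ids(candidates(index, seed=0, level=0))
similar_to_2_not_3 = to_ids(candidates(index, seed=2, level=1, excluded=1 << 3))
```

In this sketch a query is a handful of bitwise operations over precomputed bitmaps, which is why the lookup does not depend on the similarity measure being a metric. The cost moves to building and storing the bitmaps, which is where compression matters for large collections.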
