Heterogeneous Multimedia Database Selection on the Web

Multimedia databases on the Web have two distinguishing characteristics: autonomy and heterogeneity. They are established and maintained independently, and each processes queries according to its own scheme. In this paper, we investigate the problem of selecting databases from among a number of multimedia databases dispersed on the Web. When retrieving multimedia objects from distributed sites, it is crucial that the metaserver be able to find objects that are globally similar to a given query object across multimedia databases that use different local similarity measures. We propose a novel selection algorithm that determines the candidate databases containing more objects relevant to the query than other databases. Databases are selected according to the number of relevant objects at each local database, which is estimated from a few sample objects and histogram information. An extensive experiment on a large number of real data sets demonstrates that the proposed method performs well in a distributed heterogeneous environment.

Keywords: multimedia database, database selection, heterogeneity, compressed histogram
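
The selection idea summarized above can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not give the estimator, so the bucket-wise scaling, the `DatabaseSummary` structure, the `global_similarity` function (here an inverse Euclidean distance), and the relevance `threshold` are all assumptions made for illustration. The sketch only shows how a metaserver might rank databases by an estimated count of relevant objects, using a few samples per histogram bucket.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence

Vector = Sequence[float]


@dataclass
class DatabaseSummary:
    """Hypothetical per-database summary held by the metaserver."""
    name: str
    bucket_counts: Dict[int, int]      # histogram: bucket id -> number of objects
    samples: Dict[int, List[Vector]]   # bucket id -> a few sample feature vectors


def global_similarity(a: Vector, b: Vector) -> float:
    """Assumed global measure: 1 / (1 + Euclidean distance)."""
    dist = sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return 1.0 / (1.0 + dist)


def estimate_relevant(db: DatabaseSummary, query: Vector, threshold: float) -> float:
    """Scale the fraction of relevant samples in each bucket by the bucket count."""
    total = 0.0
    for bucket, count in db.bucket_counts.items():
        points = db.samples.get(bucket, [])
        if not points:
            continue  # no samples for this bucket: contributes nothing to the estimate
        hits = sum(1 for p in points if global_similarity(query, p) >= threshold)
        total += count * hits / len(points)
    return total


def select_databases(dbs: List[DatabaseSummary], query: Vector,
                     threshold: float, k: int) -> List[str]:
    """Return the names of the k databases with the highest estimates."""
    scored = [(estimate_relevant(db, query, threshold), db.name) for db in dbs]
    scored.sort(key=lambda t: (-t[0], t[1]))  # highest estimate first, ties by name
    return [name for _, name in scored[:k]]


# Toy data, invented for illustration only.
db_a = DatabaseSummary("A", {0: 100, 1: 50},
                       {0: [(0.0, 0.0), (0.1, 0.0)], 1: [(5.0, 5.0)]})
db_b = DatabaseSummary("B", {0: 10, 1: 500},
                       {0: [(0.0, 0.0)], 1: [(5.0, 5.0), (6.0, 6.0)]})
top = select_databases([db_a, db_b], query=(0.0, 0.0), threshold=0.5, k=1)
```

In this toy setup, database B is larger overall, but most of its objects fall in a bucket whose samples are far from the query, so the estimate favors A. A real metaserver would also have to reconcile each site's local similarity measure with the global one, which this sketch leaves out.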
