An Indexing Approach for Representing Multimedia Objects in High-Dimensional Spaces Based on Expectation Maximization Algorithm

In this paper we introduce a new indexing approach to representing multimedia object classes generated by the Expectation Maximization clustering algorithm in a balanced and dynamic tree structure. To this aim the EM algorithm has been modified in order to obtain at each step of its recursive application balanced clusters. In this manner our tree provides a simple and practical solution to index clustered data and support efficient retrieval of the nearest neighbors in high dimensional object spaces.

[1]  Joydeep Ghosh,et al.  A Unified Framework for Model-based Clustering , 2003, J. Mach. Learn. Res..

[2]  James C. French,et al.  Clustering large datasets in arbitrary metric spaces , 1999, Proceedings 15th International Conference on Data Engineering (Cat. No.99CB36337).

[3]  D. Rubin,et al.  Maximum likelihood from incomplete data via the EM - algorithm plus discussions on the paper , 1977 .

[4]  Aidong Zhang,et al.  ClusterTree: Integration of Cluster Representation and Nearest-Neighbor Search for Large Data Sets with High Dimensions , 2003, IEEE Trans. Knowl. Data Eng..

[5]  Jeffrey K. Uhlmann,et al.  Satisfying General Proximity/Similarity Queries with Metric Trees , 1991, Inf. Process. Lett..

[6]  Jian-Kang Wu Content-Based Indexing of Multimedia Databases , 1997, IEEE Trans. Knowl. Data Eng..

[7]  Geoffrey E. Hinton,et al.  A View of the Em Algorithm that Justifies Incremental, Sparse, and other Variants , 1998, Learning in Graphical Models.

[8]  Sergey Brin,et al.  Near Neighbor Search in Large Metric Spaces , 1995, VLDB.

[9]  Anil K. Jain,et al.  Algorithms for Clustering Data , 1988 .

[10]  Ricardo A. Baeza-Yates,et al.  Searching in metric spaces , 2001, CSUR.

[11]  Angelo Chianese,et al.  Foveated shot detection for video segmentation , 2005, IEEE Transactions on Circuits and Systems for Video Technology.

[12]  Pavel Zezula,et al.  M-tree: An Efficient Access Method for Similarity Search in Metric Spaces , 1997, VLDB.

[13]  Donald B. Rubin,et al.  Max-imum Likelihood from Incomplete Data , 1972 .

[14]  Christian Böhm,et al.  Searching in high-dimensional spaces: Index structures for improving the performance of multimedia databases , 2001, CSUR.

[15]  Peter N. Yianilos,et al.  Data structures and algorithms for nearest neighbor search in general metric spaces , 1993, SODA '93.

[16]  Jeff A. Bilmes,et al.  A gentle tutorial of the em algorithm and its application to parameter estimation for Gaussian mixture and hidden Markov models , 1998 .

[17]  Walter A. Burkhard,et al.  Some approaches to best-match file searching , 1973, Commun. ACM.

[18]  G. Celeux,et al.  A Classification EM algorithm for clustering and two stochastic versions , 1992 .

[19]  Iraj Kalantari,et al.  A Data Structure and an Algorithm for the Nearest Point Problem , 1983, IEEE Transactions on Software Engineering.

[20]  Ricardo A. Baeza-Yates,et al.  Proximity Matching Using Fixed-Queries Trees , 1994, CPM.

[21]  Michael I. Jordan Learning in Graphical Models , 1999, NATO ASI Series.