Fast Incremental Clustering of Gaussian Mixture Speaker Models for Scaling up Retrieval In On-Line Broadcast

In this paper, we introduce a hierarchical classification approach within an incremental speaker indexing framework. In the first phase, speaker-homogeneous segments are generated incrementally. In the second phase, the resulting speaker models are classified hierarchically. The approach combines the efficiency of the Gaussian mixture model (GMM) merge algorithm with the high accuracy of updating the GMMs referenced by the speaker tree index. An adaptive threshold algorithm reduces the cost of searching for a speaker's GMM in the balanced binary tree that forms the speaker index, so that the search complexity becomes logarithmic in the number of indexed speakers.
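To make the idea of a merged-model tree concrete, the following is a minimal sketch and not the method of the paper. It assumes each speaker is summarized by a single diagonal-covariance Gaussian, merges two models by moment matching, builds a balanced binary tree bottom-up by pairwise merging, and descends greedily toward the child whose mean is closer to the query. The names `Gaussian`, `Node`, `merge`, `build_tree`, and `descend` are hypothetical, and the squared-Euclidean comparison stands in for whatever similarity measure and adaptive threshold the paper actually uses.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Gaussian:
    weight: float
    mean: List[float]
    var: List[float]  # diagonal covariance (assumption for this sketch)


@dataclass
class Node:
    model: Gaussian
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def merge(a: Gaussian, b: Gaussian) -> Gaussian:
    """Moment-matching merge: preserves total weight, mean, and variance."""
    w = a.weight + b.weight
    pa, pb = a.weight / w, b.weight / w
    mean = [pa * ma + pb * mb for ma, mb in zip(a.mean, b.mean)]
    var = [
        pa * (va + (ma - m) ** 2) + pb * (vb + (mb - m) ** 2)
        for va, vb, ma, mb, m in zip(a.var, b.var, a.mean, b.mean, mean)
    ]
    return Gaussian(w, mean, var)


def build_tree(leaves: List[Gaussian]) -> Node:
    """Pair adjacent nodes level by level until a single root remains."""
    level = [Node(g) for g in leaves]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level) - 1, 2):
            l, r = level[i], level[i + 1]
            nxt.append(Node(merge(l.model, r.model), l, r))
        if len(level) % 2:  # an unpaired node is carried up unchanged
            nxt.append(level[-1])
        level = nxt
    return level[0]


def sq_dist(u: List[float], v: List[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(u, v))


def descend(root: Node, query: List[float]) -> Tuple[Gaussian, int]:
    """Greedy descent; returns the reached leaf model and the steps taken."""
    node, steps = root, 0
    while node.left is not None and node.right is not None:
        dl = sq_dist(node.left.model.mean, query)
        dr = sq_dist(node.right.model.mean, query)
        node = node.left if dl <= dr else node.right
        steps += 1
    return node.model, steps


# Eight toy one-dimensional "speaker models" with means 0.0 .. 7.0.
leaves = [Gaussian(1 / 8, [float(i)], [1.0]) for i in range(8)]
root = build_tree(leaves)
leaf, steps = descend(root, [6.9])
```

With eight leaves the descent makes three comparisons rather than eight, which illustrates the logarithmic search cost the abstract refers to. A greedy descent of this kind is not guaranteed to reach the closest leaf in general.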
