Topic detection in broadcast news
We propose a system for the Topic Detection and Tracking (TDT) detection task, which concerns the unsupervised grouping of news stories by topic. We cluster stories with an incremental k-means algorithm. To compare stories, we use both a probabilistic document similarity metric and a traditional vector-space metric. We note that the clustering algorithm requires two different types of metrics, and we adapt the similarity metrics for each purpose. The system achieves a topic-weighted miss rate of 12% at a false accept rate of 0.22%.
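As a rough illustration of the incremental clustering idea, the following Python sketch assigns each incoming story to the most similar existing cluster, or starts a new cluster when no similarity reaches a threshold. It is a simplified single-pass version under stated assumptions, not the system described above: it uses cosine similarity over raw term counts for both the cluster-selection and the thresholding decisions (the abstract says these need different, adapted metrics), and the function names and the threshold value of 0.3 are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def incremental_cluster(stories, threshold=0.3):
    """Assign each story, in arrival order, to a cluster index.

    Assumption: a cluster is represented by the summed term counts of
    its member stories, and one metric (cosine) serves both decisions.
    """
    clusters = []  # one Counter of summed term counts per cluster
    labels = []
    for text in stories:
        vec = Counter(text.lower().split())
        # Selection step: find the closest existing cluster.
        best_idx, best_sim = -1, 0.0
        for i, centroid in enumerate(clusters):
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best_idx, best_sim = i, sim
        # Thresholding step: merge if close enough, otherwise start a new topic.
        if best_idx >= 0 and best_sim >= threshold:
            clusters[best_idx].update(vec)
            labels.append(best_idx)
        else:
            clusters.append(Counter(vec))
            labels.append(len(clusters) - 1)
    return labels
```

Calling `incremental_cluster` on a list of story texts returns one cluster index per story. Raising the hypothetical threshold produces more, smaller clusters, which would be expected to trade false accepts for misses.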