An online customizable music retrieval system with a spoken dialogue interface

In this paper, we introduce a spoken language interface for music information retrieval. In response to voice commands, the system searches for a song in an internet music shop or in a "playlist" stored on the local PC, and then plays it. To cope with the almost unlimited size of the vocabulary, we implemented a remote server program with which users can customize their recognition grammar and dictionary. When a user selects favorite artists, the server program automatically generates a minimal set of recognition grammars and a dictionary, which are then sent to the interface program. As a result, the vocabulary averages fewer than 1000 words per user. To perform a field test of the system, we implemented a speech collection capability, whereby speech utterances are compressed in Free Lossless Audio Codec (FLAC) format and sent back to the server program together with dialogue logs. Currently, the system is available to the public for experimental use. More than 100 users are involve...
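
The per-user grammar customization described above can be illustrated with a minimal sketch. It assumes a toy catalog and three command templates; the names `CATALOG`, `TEMPLATES`, `tokenize`, and `build_user_grammar` are hypothetical and do not come from the system itself, whose actual grammar and dictionary formats are not specified here. The sketch only shows how restricting generation to the selected artists keeps the recognition vocabulary small.

```python
import re

# Hypothetical catalog: artist name -> song titles (illustrative data only).
CATALOG = {
    "The Beatles": ["Yesterday", "Let It Be"],
    "Queen": ["Bohemian Rhapsody"],
    "Miles Davis": ["So What"],
}

# Hypothetical command templates; the real grammar format is an assumption.
TEMPLATES = ["play {title}", "play {title} by {artist}", "search for {artist}"]


def tokenize(phrase):
    """Lowercase a phrase and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", phrase.lower())


def build_user_grammar(selected_artists, catalog=CATALOG):
    """Return (phrases, vocabulary) restricted to the selected artists."""
    phrases = set()
    for artist in selected_artists:
        # Artists missing from the catalog contribute nothing.
        for title in catalog.get(artist, []):
            for template in TEMPLATES:
                filled = template.format(title=title, artist=artist)
                phrases.add(" ".join(tokenize(filled)))
    # The dictionary holds only words that occur in the generated phrases.
    vocabulary = sorted({word for phrase in phrases for word in tokenize(phrase)})
    return sorted(phrases), vocabulary


phrases, vocabulary = build_user_grammar(["Queen"])
print(phrases)
print(vocabulary)
```

With only "Queen" selected, the generated vocabulary contains the template words plus the words in that artist's name and titles; songs by unselected artists never enter the dictionary. A real dictionary would also need a pronunciation for each word, which this sketch omits.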