Similarity is widely used in information retrieval and extraction, and it can also be applied to dialog processing. Spoken dialog processing must cope with speech recognition errors, interjections, and noise, and speakers rarely use the same expression consistently. A system therefore needs to find a sentence that is similar to the input sentence while taking these phenomena into account. This paper proposes a method for identifying the question sentence based on TF⋅AoI (term frequency × amount of information) weighting. In this method, the words contained in the input sentence are weighted by (word similarity) × (amount of information). Then, based on the calculated Euclidean distance, the response corresponding to the question with the highest similarity is output. Comparison experiments show an improvement of 13 points over a method that compares matching ratios against the input sentence, and of 6.5 points over the method of “similarity by TF⋅AoI weighting.”

© 2007 Wiley Periodicals, Inc. Syst Comp Jpn, 38(10): 81–94, 2007; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/scj.20363
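The weighting and matching steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define word similarity or amount of information, so the sketch assumes a character-level ratio from `difflib.SequenceMatcher` for word similarity and a smoothed `-log2` of the fraction of stored questions containing a word for the amount of information. The question/response pairs and all function names are hypothetical.

```python
import math
from difflib import SequenceMatcher

# Hypothetical question/response pairs; not taken from the paper.
QA_PAIRS = [
    ("what time does the library open", "The library opens at nine."),
    ("where is the library", "The library is next to the station."),
    ("what time does the cafeteria close", "The cafeteria closes at six."),
]


def amount_of_information(word, questions):
    """Assumed AoI: -log2 of the smoothed fraction of questions containing word."""
    containing = sum(1 for q in questions if word in q)
    # Add-one smoothing keeps the value finite for words seen in no question.
    p = (containing + 1) / (len(questions) + 1)
    return -math.log2(p)


def word_similarity(a, b):
    """Assumed word similarity: character-level match ratio in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()


def distance(input_words, question_words, questions):
    """Euclidean distance between the two weighted sentence vectors."""
    vocab = sorted(set(input_words) | set(question_words))
    squared = 0.0
    for w in vocab:
        aoi = amount_of_information(w, questions)
        # Weight = (best word similarity within the sentence) x (AoI).
        in_weight = max(word_similarity(w, x) for x in input_words) * aoi
        q_weight = max(word_similarity(w, x) for x in question_words) * aoi
        squared += (in_weight - q_weight) ** 2
    return math.sqrt(squared)


def respond(utterance):
    """Return the response of the stored question closest to the utterance."""
    questions = [q.split() for q, _ in QA_PAIRS]
    input_words = utterance.split()
    dists = [distance(input_words, q, questions) for q in questions]
    best = min(range(len(dists)), key=dists.__getitem__)
    return QA_PAIRS[best][1]


# An interjection ("um") and a misrecognized word ("libary") in the input.
print(respond("um what time does the libary open"))
```

Under these assumptions, a misrecognized word such as "libary" still receives a weight close to that of "library", so the smallest distance (highest similarity) continues to point at the intended question. A production system would likely replace both assumed measures with ones estimated from dialog data.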