ONOMATOPEDIA: Onomatopoeia Online Example Dictionary System Extracted from Data on the Web

Japanese is rich in onomatopoeic words, which imitate sounds or depict actions, like the English "click" or "bow-wow." Mastering onomatopoeic expressions is generally hard for non-native speakers, and example-based dictionaries are known to be useful for learning Japanese onomatopoeia. Constructing such a dictionary requires collecting as many examples as possible. This paper proposes ONOMATOPEDIA, an online example-based onomatopoeia dictionary comprising extensive example sentences collected from the Web. Web search results often include inappropriate sentences, for example, sentences in which onomatopoeic words are used as nicknames, or sentences that include uncommon usage patterns. We therefore propose a model for extracting sentences that are appropriate as learning examples. We further propose an algorithm for clustering sentences that contain onomatopoeia, which takes into account that an onomatopoeic word can have different meanings depending on the context.