Query Structuring with Two-Stage Term Dependence in the Japanese Language

We investigate the effectiveness of query structuring in the Japanese language by composing or decomposing compound words and phrases. Our method is based on a theoretical framework using Markov random fields. Our two-stage term dependence model captures both the global dependencies between query components explicitly delimited by separators in a query, and the local dependencies between constituents within a compound word when the compound word appears in a query component. Experiments on a 100-gigabyte web document collection, mostly written in Japanese, show that our model works well, particularly when query structuring is applied to compound words.
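
The two stages described above can be illustrated with a minimal sketch. It is not the model itself: it only enumerates the term pairs that each stage would consider, and it leaves out the Markov random field potentials and their weights. The names `TOY_COMPOUNDS`, `split_components`, `adjacent_pairs`, and `two_stage_dependencies` are hypothetical, the toy lexicon stands in for a real morphological analyzer, and the choices of all component pairs for the global stage and adjacent constituent pairs for the local stage are assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical toy lexicon mapping a compound word to its constituents.
# A real system would obtain this from a morphological analyzer.
TOY_COMPOUNDS = {
    "情報検索": ["情報", "検索"],
    "自然言語処理": ["自然", "言語", "処理"],
}


def split_components(query):
    """Split a query into components on explicit separators.

    Both ASCII whitespace and the full-width space (U+3000) are
    treated as separators (an assumption of this sketch).
    """
    return query.replace("\u3000", " ").split()


def adjacent_pairs(items):
    """Return consecutive pairs, e.g. [a, b, c] -> [(a, b), (b, c)]."""
    return list(zip(items, items[1:]))


def two_stage_dependencies(query, compounds=TOY_COMPOUNDS):
    """Enumerate global and local dependency candidates for a query.

    Stage 1 (global): pairs of query components.
    Stage 2 (local): adjacent constituent pairs inside each component
    that the lexicon recognizes as a compound word.
    """
    components = split_components(query)
    global_deps = list(combinations(components, 2))
    local_deps = {
        c: adjacent_pairs(compounds[c]) for c in components if c in compounds
    }
    return {"components": components, "global": global_deps, "local": local_deps}
```

For the query `情報検索 評価`, this sketch yields one global pair, (`情報検索`, `評価`), and one local pair inside the compound, (`情報`, `検索`). A component absent from the lexicon, such as `評価` here, contributes no local pairs.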