Although today's web search engines are very powerful, they still fail to return intuitively relevant results for many types of queries, especially those that are only vaguely formed in the user's own mind. We argue that associations between the terms in a search query can reveal the user's underlying information need and should be taken into account in search. We propose a multi-faceted approach to detecting and exploiting such associations. The CORDER method measures the association strength between query terms; a query whose terms have low association strength with one another is treated as a 'vague query'. For a vague query, we use WordNet to find terms related to the query terms and compose extended queries, relying especially on least common subsumers (LCS). We then refine these extended queries using the relation strength between terms calculated by the CORDER method. Finally, we expand the refined queries using the Hyperspace Analogue to Language (HAL) model and the information flow (IF) method. Initial experimental results on a corpus of 500 books from Amazon show that our approach can find the right books for users given authentic vague queries, even in cases where Google and Amazon's own book search fail.
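The vague-query test described above can be illustrated with a minimal sketch. It does not implement CORDER, whose scoring is not specified here; as a stand-in, association strength is assumed to be the Jaccard overlap of the document sets in which two terms occur. The function names, the threshold value, and the toy documents are all hypothetical.

```python
from itertools import combinations


def association(term_a, term_b, docs):
    """Stand-in association strength (NOT CORDER): Jaccard overlap of the
    sets of documents that contain each term."""
    with_a = {i for i, doc in enumerate(docs) if term_a in doc}
    with_b = {i for i, doc in enumerate(docs) if term_b in doc}
    union = with_a | with_b
    return len(with_a & with_b) / len(union) if union else 0.0


def is_vague(query_terms, docs, threshold=0.2):
    """Flag a query as vague when the mean pairwise association strength
    of its terms falls below an assumed threshold."""
    pairs = list(combinations(query_terms, 2))
    if not pairs:
        # A single-term query has no term pairs to compare.
        return False
    mean = sum(association(a, b, docs) for a, b in pairs) / len(pairs)
    return mean < threshold


# Toy corpus: each document is reduced to its set of terms (invented data).
docs = [
    {"python", "programming", "code"},
    {"python", "programming", "tutorial"},
    {"garden", "roses", "soil"},
    {"cooking", "recipes", "soil"},
]

print(is_vague(["python", "programming"], docs))
print(is_vague(["python", "roses"], docs))
```

In this toy corpus, "python" and "programming" always co-occur, so that query is not flagged, whereas "python" and "roses" never share a document and the query is flagged as vague. A real system would replace `association` with the CORDER score and tune the threshold on actual queries.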