Intuitively, words that form phrases describe content more precisely than the same words treated as a sequence of keywords. Yet the evidence that phrases are more effective for information retrieval is inconclusive. This paper isolates a neglected class of phrases that is abundant in communication, has an established theoretical foundation, and shows promise as an effective expression of the user's information need: the noun-noun compound (NNC). In an experiment, a variety of meaningful NNCs were used to isolate relevant passages in a large and varied corpus. In a first pass, passages were retrieved based on the textual proximity of the words or their semantic peers. A second pass retained only those passages containing a syntactically coherent structure equivalent to the original NNC. This second pass showed a dramatic increase in precision. Preliminary results support our intuition about phrases in the special but very productive case of NNCs.
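The two-pass procedure can be illustrated with a minimal Python sketch. It is not the implementation used in the experiment: the window size, the `peers` dictionary standing in for semantic peers, and the small set of paraphrase patterns ("modifier head", "head of/for/in [the] modifier") standing in for a syntactic coherence check are all assumptions made for illustration, and every function name is hypothetical.

```python
import re

# Assumed paraphrase vocabulary; the abstract does not specify which
# syntactic structures count as equivalent to the original NNC.
PREPS = {"of", "for", "in"}
DETS = {"the", "a", "an"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def _expand(word, peers):
    """Return the word together with any listed peers (hypothetical lookup)."""
    return {word} | set((peers or {}).get(word, ()))


def first_pass(passages, modifier, head, peers=None, window=10):
    """Keep passages where modifier and head (or peers) co-occur within `window` tokens."""
    mods, heads = _expand(modifier, peers), _expand(head, peers)
    kept = []
    for passage in passages:
        toks = tokenize(passage)
        mod_pos = [i for i, t in enumerate(toks) if t in mods]
        head_pos = [i for i, t in enumerate(toks) if t in heads]
        if any(abs(i - j) <= window for i in mod_pos for j in head_pos):
            kept.append(passage)
    return kept


def _has_compound(toks, mods, heads):
    """Crude stand-in for a syntactic check: match a few surface patterns."""
    for i, tok in enumerate(toks):
        # Pattern 1: modifier immediately followed by head ("oil price").
        if tok in mods and i + 1 < len(toks) and toks[i + 1] in heads:
            return True
        # Pattern 2: head + preposition + optional determiner + modifier.
        if tok in heads and i + 2 < len(toks) and toks[i + 1] in PREPS:
            j = i + 2
            if toks[j] in DETS:
                j += 1
            if j < len(toks) and toks[j] in mods:
                return True
    return False


def second_pass(passages, modifier, head, peers=None):
    """Retain only passages containing a structure equivalent to the compound."""
    mods, heads = _expand(modifier, peers), _expand(head, peers)
    return [p for p in passages if _has_compound(tokenize(p), mods, heads)]


if __name__ == "__main__":
    corpus = [
        "The oil price rose sharply.",
        "The price of oil fell.",
        "Oil was cheap, but the price of bread rose.",
        "Bread is tasty.",
    ]
    candidates = first_pass(corpus, "oil", "price")
    print(second_pass(candidates, "oil", "price"))
```

In this toy corpus the third passage passes the proximity filter but is dropped by the second pass, because "oil" and "price" occur near each other without forming a structure equivalent to the compound. That is the kind of false positive the second pass is meant to remove; a real system would presumably rely on a parser and a lexical resource rather than fixed patterns.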