Effective collection metasearch in a hierarchical environment: global vs. localized retrieval performance

We compare standard global IR searching with user-centric localized techniques as ways of addressing the database selection problem. In a series of experiments, we measure the retrieval effectiveness of three search modes applied to a hierarchically structured data environment of textual database representations. The data environment is a tree-like directory containing over 15,000 unique databases and over 100,000 total leaf nodes. The search modes combine browsing and searching to varying degrees, from a global search at the root node to a refined search at a sub-node that uses dynamically calculated inverse document frequencies (idfs) to score candidate databases for probable relevance. Our findings indicate that a browse-and-search approach that relies on localized searching from sub-nodes can produce the most effective results.
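The contrast between a global search at the root and a localized search at a sub-node can be illustrated with a minimal sketch. It is not the implementation used in the experiments: the `Node` class, the toy directory, the whitespace tokenizer, and the smoothed weight log(1 + N/df) are all assumptions chosen for illustration. The only idea taken from the text above is that idfs are recomputed over the databases beneath the node where the search is issued.

```python
import math
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical directory node; a leaf carries a database description."""
    name: str
    text: str = ""
    children: list = field(default_factory=list)


def leaves(node):
    """Yield every leaf (database representation) beneath `node`."""
    if not node.children:
        yield node
    for child in node.children:
        yield from leaves(child)


def local_idf(node, terms):
    """Compute idfs using only the databases in the subtree rooted at `node`."""
    docs = [set(leaf.text.lower().split()) for leaf in leaves(node)]
    n = len(docs)
    idf = {}
    for term in terms:
        df = sum(1 for doc in docs if term in doc)
        # Assumed smoothing; terms absent from the subtree get weight 0.
        idf[term] = math.log(1 + n / df) if df else 0.0
    return idf


def rank(node, query):
    """Score each database under `node` by summed tf * local idf, best first."""
    terms = query.lower().split()
    idf = local_idf(node, terms)
    scored = []
    for leaf in leaves(node):
        tf = Counter(leaf.text.lower().split())
        scored.append((leaf.name, sum(tf[t] * idf[t] for t in terms)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy directory: two sub-nodes beneath the root (all names are made up).
science = Node("science", children=[
    Node("astro_db", "star galaxy telescope data"),
    Node("bio_db", "gene protein cell data"),
])
sports = Node("sports", children=[
    Node("football_db", "football scores league data"),
    Node("star_players_db", "star players football data"),
    Node("tennis_db", "tennis rankings data"),
])
root = Node("root", children=[science, sports])

global_ranking = rank(root, "star data")     # idfs over all five databases
local_ranking = rank(science, "star data")   # idfs over the science subtree only
```

In this toy directory the global search returns candidates from both branches, while the search issued from `science` considers only the two databases beneath it and weights "star" by its rarity within that subtree rather than across the whole directory.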