Simulation of Within-Session Query Variations Using a Text Segmentation Approach

We propose a generative model that automatically produces query reformulations from an initial query, using the underlying subtopic structure of the top-ranked retrieved documents. We address three types of query reformulation: a) specialization; b) generalization; and c) drift. To test our model, we generate three reformulation variants, using selected fields from the TREC-8 topics as the initial queries. We use manual judgements from multiple assessors to measure the accuracy of the reformulated query variants, and observe accuracies of 65%, 82% and 69% for specialization, generalization and drift reformulations, respectively.