Automatic construction of known-item finding test beds

This work is an initial study of the utility of automatically generated queries for evaluating known-item retrieval, and of how such queries compare to real queries. The main advantage of generating queries automatically is that, for any given test collection, numerous queries can be produced at minimal cost. This matters for evaluation because state-of-the-art algorithms can then be tested on different types of generated queries, each mimicking a particular querying style that a user may adopt. Our approach draws upon previous IR research that generated simulated queries probabilistically for other purposes [2, 3].