The LHCb experiment [1] needs to store all of the information about its datasets and their processing history, both for data recorded from particle collisions at the Large Hadron Collider at CERN [2] and for simulated data. To achieve this, a database design based on data warehousing techniques has been chosen, in which several user services can be implemented and optimized individually without losing functionality or performance. This approach results in a flexible, experiment-independent system. It allows fast access to the catalogue of available data, to detailed history information and to the catalogue of data replicas, and queries can be made on any of these three sets of information. A flexible underlying database schema allows these services to be implemented and to evolve without changes to the schema itself. The consistent implementation of interfaces based on XML-RPC allows the stored information to be accessed and modified through a well-defined, encapsulating API. In this document, we discuss the definition of the metadata describing datasets and how these metadata are used by LHCb physicists to retrieve and access data from simulated particle collisions.