UNED-READERS: Filtering Relevant Tweets using Probabilistic Signature Models

This paper describes the unsupervised, knowledge-based approach that the UNED-READERS system followed at RepLab 2013 to filter the tweets relevant to a given entity. The approach relies on a new way of contextualizing entity names from relatively large and broad collections of texts using probabilistic signature models (i.e., discrete probability distributions of words lexically related to the knowledge or topic underlying a set of entities in background text collections). The contextualization is intended to recover relevant information about the entity (specifically, lexically related words) from background knowledge. Results obtained in the filtering task are presented.
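
To make the notion of a signature model concrete, the following is a minimal illustrative sketch and not the UNED-READERS implementation. It assumes that a signature can be approximated by the relative frequencies of words co-occurring with the entity name in background documents, and that a tweet can be scored by the probability mass of the signature words it contains. The names build_signature and relevance_score, the tokenizer, and the scoring rule are all hypothetical choices made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_signature(entity, background_docs, stopwords=frozenset()):
    """Return a discrete probability distribution over words that co-occur
    with the entity name in the background documents.

    Assumption: relative co-occurrence frequency stands in for the
    probabilistic signature model mentioned in the abstract.
    """
    entity_tokens = set(tokenize(entity))
    counts = Counter()
    for doc in background_docs:
        tokens = tokenize(doc)
        # Only documents that mention every token of the entity name count.
        if entity_tokens <= set(tokens):
            counts.update(
                t for t in tokens
                if t not in entity_tokens and t not in stopwords
            )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: count / total for word, count in counts.items()}


def relevance_score(tweet, signature):
    """Sum the signature probabilities of the distinct words in the tweet.

    Assumption: a higher score suggests the tweet is more likely to be
    about the entity; a decision threshold would have to be chosen separately.
    """
    return sum(signature.get(t, 0.0) for t in set(tokenize(tweet)))
```

Under these assumptions, a tweet that shares vocabulary with the background documents about the entity receives a positive score, while a tweet that mentions only the entity name scores zero, so filtering reduces to comparing the score against a threshold.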