Recognizing Tweet Relevance with Profile-specific and Profile-independent Supervised Models

In the 2017 TREC (Text Retrieval Conference) Real-Time Summarization (RTS) track, we explored supervised methods for identifying tweets relevant to a user’s interest profile. We focused primarily on two approaches: profile-specific and profile-independent. In the profile-specific approach, we trained a separate model for each interest profile using features specific to that profile. In the profile-independent approach, we trained a single model using features that generalize across all profiles. To train the supervised models, we used labeled data from the previous year’s challenge. We additionally introduced a novel method for automatically labeling tweets with relevance scores. The method treated keywords from titles as essential information and penalized a tweet’s relevance score when those keywords were absent, while treating keywords from descriptions as supporting information and rewarding the relevance score when those keywords were present. In scenario A (real-time push notification), our best run yielded 9.95% EG-p and 11.11% nDCG-p improvements over the median in batch evaluation. In scenario B (daily digest), our best run achieved a 25.43% nDCG-p improvement over the median.
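The automatic labeling idea can be illustrated with a minimal sketch. The abstract does not give the scoring formula, so everything concrete here is an assumption made for illustration: the function names `tokenize` and `relevance_score`, the simple alphanumeric tokenizer, the additive form of the score, and the `base`, `penalty`, and `reward` values. It shows only the stated principle: each missing title keyword lowers the score, and each present description keyword raises it.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens (assumed tokenizer)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def relevance_score(tweet, title_keywords, description_keywords,
                    base=0.0, penalty=1.0, reward=0.5):
    """Hypothetical additive relevance score for one tweet.

    Title keywords are treated as essential: each one absent from the
    tweet subtracts `penalty`. Description keywords are treated as
    supporting: each one present in the tweet adds `reward`.
    The weights are illustrative assumptions, not values from the paper.
    """
    tokens = tokenize(tweet)
    score = base
    for keyword in title_keywords:
        if keyword.lower() not in tokens:
            score -= penalty
    for keyword in description_keywords:
        if keyword.lower() in tokens:
            score += reward
    return score


# Made-up profile keywords and tweets, for illustration only.
title_kw = ["hubble", "telescope"]
description_kw = ["nasa", "images"]

full_match = relevance_score("New Hubble telescope images released by NASA",
                             title_kw, description_kw)
support_only = relevance_score("NASA shares new images",
                               title_kw, description_kw)
partial_title = relevance_score("Hubble photos", title_kw, description_kw)
```

Under these assumed weights, a tweet containing all title keywords and both description keywords scores highest, while a tweet with only supporting keywords still ends up below zero, which reflects the asymmetry between essential and supporting information described above.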