On the Evaluation of Tweet Timeline Generation Task

The Tweet Timeline Generation (TTG) task aims to generate a timeline of relevant but novel tweets that summarizes the development of a given topic. A typical TTG system first retrieves tweets, then detects the novel tweets among them to form a timeline. In this paper, we examine how strongly TTG performance depends on retrieval quality and whether that dependency biases evaluation. Our study shows a considerable dependency; however, the ranking of systems is not highly affected if a common retrieval run is used.