Microblogging platforms such as Twitter have become a primary medium for people to share their experiences and opinions on a broad range of topics. Because posts on Twitter are publicly viewable by default, Twitter can be used to obtain up-to-date information on events such as natural disasters, disease outbreaks, or sports events. Building a cohesive summary from tweets about long-running events is a problem of active interest to the research community. However, the abundance of tweets expressing user opinions and sentiments toward a topic makes it necessary to extract the newsworthy tweets from a large stream of tweets on a single topic. Most existing methods for this task require large hand-labeled corpora to train the model, which is impractical for a rapidly updating medium like Twitter. In this paper, we address this problem by introducing a novel heuristic-based annotation scheme that generates the training dataset for the system. A hand-labeled corpus of tweets is used only for benchmarking the objectivity classifier. Our classifier achieves an F1-score of 80% on a manually annotated gold-standard dataset.