Proceedings of the Workshop on Events in Emerging Text Types

The proliferation of the Internet has revolutionised the way information is disseminated and presented. Blogs no longer just relay and comment on news stories but also influence what is talked about in the news. Such changes have not gone unnoticed by the computational linguistics research community, which is increasingly processing or exploiting blogs in an attempt to keep track of what is going on and to mine information.

This workshop focuses on how events can be identified and how information related to event processing (e.g. NP coreference, temporal processing) can be extracted from blogs and other online sources. Emphasis is on how existing methods for event processing need to be adapted in order to process this medium, and on linguistic differences between the reporting of events in blogs and in more traditional news texts.

Event detection and processing is not a new topic in computational linguistics, but until now it has focused mainly on the processing of newswire. The TimeBank corpus (Pustejovsky et al. 2003), the AQUAINT TimeML corpus, and the NP4E corpus (Hasler, Orasan and Naumann 2006) contain newswire exclusively, which may make them inappropriate for the development of methods that need to process other text types. Moreover, the informal style and structure of most blog entries make event detection in these documents a difficult task. This workshop gives researchers the opportunity to present efforts to develop resources related to event identification and processing using blog entries, including annotation guidelines and linguistic analyses of such resources.

Constantin Orasan | Laura Hasler | Corina Forascu