Dating Documents using Graph Convolution Networks

The date of a document is essential for many important tasks, such as document retrieval, summarization, and event detection. While existing approaches to these tasks assume accurate knowledge of the document date, it is not always available, especially for arbitrary documents from the Web. Document dating is a challenging problem that requires inference over the temporal structure of the document. Prior document dating systems have largely relied on handcrafted features while ignoring such document-internal structures. In this paper, we propose NeuralDater, a Graph Convolutional Network (GCN) based document dating approach that jointly exploits the syntactic and temporal graph structures of a document in a principled way. To the best of our knowledge, this is the first application of deep learning to the problem of document dating. Through extensive experiments on real-world datasets, we find that NeuralDater significantly outperforms the state-of-the-art baseline, by 19 percentage points of accuracy in absolute terms (45% relative).
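
For readers unfamiliar with graph convolution, the following is a minimal sketch of a single generic GCN layer on an undirected graph, using the common symmetric normalization with self-loops. It is an illustration only and not NeuralDater's actual formulation, which operates on syntactic and temporal graphs; the function name `gcn_layer`, the toy graph, and the weights are all assumptions made for this example.

```python
import math

def gcn_layer(adj, feats, weight):
    """One generic graph-convolution step: ReLU(A_hat @ feats @ weight).

    adj    : n x n adjacency matrix (list of lists, 0/1 entries)
    feats  : n x d node feature matrix
    weight : d x o weight matrix (fixed here; learned in practice)
    """
    n = len(adj)
    d_in, d_out = len(weight), len(weight[0])
    # Add self-loops so each node keeps its own features.
    a = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    # Symmetric normalization: D^{-1/2} (A + I) D^{-1/2}.
    deg = [sum(row) for row in a]
    a_hat = [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
             for i in range(n)]
    # Aggregate features from each node's neighbourhood.
    agg = [[sum(a_hat[i][k] * feats[k][d] for k in range(n))
            for d in range(d_in)] for i in range(n)]
    # Apply the linear map followed by ReLU.
    return [[max(0.0, sum(agg[i][d] * weight[d][o] for d in range(d_in)))
             for o in range(d_out)] for i in range(n)]

# Toy example (hypothetical data): a 3-node path graph 0 - 1 - 2.
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
feats = [[1.0], [2.0], [3.0]]   # one feature per node
weight = [[1.0, -1.0]]          # maps 1 input feature to 2 outputs
out = gcn_layer(adj, feats, weight)
```

Each output row mixes a node's own feature with those of its neighbours, so stacking several such layers lets information travel along longer paths in the graph.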
