Paper Information - Learning to Summarize Time Series Data

Learning to Summarize Time Series Data

In this paper we focus on content selection for summarizing time series data using machine learning techniques. The goal is to exploit a parallel corpus to predict the appropriate level of abstraction required for a summarization task. This is an important step towards building an automated Natural Language Generation (NLG) system that generates text for unseen data. Machine learning approaches are used to induce the underlying rules for text summarization, which are potentially close to the ones that humans use to generate textual summaries. We present an approach to select important points in a time series that can aid in generating captions or textual summaries. We evaluate our techniques on a parallel corpus of human-generated weather forecast texts and the corresponding numerical weather prediction data.

Sutanu Chakraborti | Somayajulu Sripada | Pranay Kumar Venkata Sowdaboina

[1] Jim Hunter et al. Segmenting Time Series for Weather Forecasting, 2003.

[2] Robert Dale et al. Building applied natural language generation systems, 1997, Natural Language Engineering.

[3] Kathleen McKeown et al. Statistical Acquisition of Content Selection Rules for Natural Language Generation, 2003, EMNLP.

[4] Ian H. Witten et al. The WEKA data mining software: an update, 2009, SIGKDD Explorations.

[5] Nikiforos Karamanis et al. Investigating Content Selection for Language Generation using Machine Learning, 2009, ENLG.

[6] Jim Hunter et al. Modelling the Task of Summarising Time Series Data Using KA Techniques, 2002.

[7] Jim Hunter et al. Exploiting a parallel TEXT-DATA corpus, 2003.

[8] David D. Jensen et al. Mining of Concurrent Text and Time Series, 2008.

[9] Richard I. Kittredge et al. Using natural-language processing to produce weather forecasts, 1994, IEEE Expert.

[10] Hannu Toivonen et al. Estimating the number of segments in time series data using permutation tests, 2002, IEEE International Conference on Data Mining, Proceedings.

[11] Ehud Reiter et al. Learning the Meaning and Usage of Time Phrases from a Parallel Text-Data Corpus, 2003, HLT-NAACL 2003.

[12] E. Reiter et al. Acquiring Correct Knowledge for Natural Language Generation, 2011, J. Artif. Intell. Res.

[13] Ehud Reiter et al. Lessons from a failure: Generating tailored smoking cessation letters, 2003, Artif. Intell.