Choosing the content of textual summaries of large time-series data sets

Natural Language Generation (NLG) can be used to generate textual summaries of numeric data sets. In this paper we develop an architecture for generating short (a few sentences) summaries of large (100KB or more) time-series data sets. The architecture integrates pattern recognition, pattern abstraction, selection of the most significant patterns, microplanning (especially aggregation), and realisation. We also describe and evaluate SumTime-Turbine, a prototype system which uses this architecture to generate textual summaries of sensor data from gas turbines.
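The five stages named above can be illustrated with a toy pipeline. The following is only a minimal sketch under stated assumptions, not SumTime-Turbine's implementation: the function names, the `Pattern` class, the "spike"/"dip" labels, the median baseline, and the magnitude-based ranking are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Pattern:
    kind: str         # "spike" or "dip" (hypothetical labels)
    start: int        # index of the first sample in the pattern
    end: int          # index of the last sample in the pattern
    magnitude: float  # largest absolute deviation from the baseline


def recognise(series, threshold):
    """Pattern recognition: find maximal runs of samples that deviate
    from the median by more than `threshold`, all in the same direction."""
    base = median(series)
    runs, run = [], []
    for i, value in enumerate(series):
        dev = value - base
        outlying = abs(dev) > threshold
        if outlying and (not run or (dev > 0) == (run[-1][1] > 0)):
            run.append((i, dev))
        else:
            if run:
                runs.append(run)
            run = [(i, dev)] if outlying else []
    if run:
        runs.append(run)
    return runs


def abstract(runs):
    """Pattern abstraction: replace each raw run with a symbolic description."""
    return [
        Pattern(
            kind="spike" if run[0][1] > 0 else "dip",
            start=run[0][0],
            end=run[-1][0],
            magnitude=max(abs(dev) for _, dev in run),
        )
        for run in runs
    ]


def select(patterns, k):
    """Content selection: keep the k largest patterns, in time order."""
    top = sorted(patterns, key=lambda p: p.magnitude, reverse=True)[:k]
    return sorted(top, key=lambda p: p.start)


def aggregate(patterns):
    """Microplanning (aggregation): group patterns of the same kind
    so that each kind is reported in a single sentence."""
    groups = {}
    for p in patterns:
        groups.setdefault(p.kind, []).append(p)
    return groups


def realise(groups):
    """Realisation: turn each group into an English sentence."""
    sentences = []
    for kind, ps in groups.items():
        times = [str(p.start) for p in ps]
        if len(ps) == 1:
            sentences.append(f"There was a {kind} at sample {times[0]}.")
        else:
            joined = ", ".join(times[:-1]) + " and " + times[-1]
            sentences.append(f"There were {len(ps)} {kind}s, at samples {joined}.")
    return " ".join(sentences)


def summarise(series, threshold, k):
    """Run all five stages in order."""
    runs = recognise(series, threshold)
    return realise(aggregate(select(abstract(runs), k)))
```

For example, `summarise([10, 10, 10, 18, 19, 10, 10, 2, 10, 10, 15, 10], threshold=3, k=2)` returns "There was a spike at sample 3. There was a dip at sample 7."; the smaller spike at sample 10 is recognised but dropped at the selection stage. A real system would need domain knowledge at each stage, which this sketch deliberately omits.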
