On modeling vagueness and uncertainty in data-to-text systems through fuzzy sets

Managing vagueness and uncertainty remains one of the unresolved challenges in systems that generate texts from non-linguistic data, known as data-to-text (D2T) systems. In the last decade, work on fuzzy linguistic summarization and description of data has raised interest in using fuzzy sets to model and manage the imprecision of human language in data-to-text systems. However, despite some research in this direction, there has been no clear discussion and justification of how fuzzy sets can contribute to data-to-text for modeling vagueness and uncertainty in words and expressions. This paper intends to bridge this gap by answering the following questions: What does vagueness mean in fuzzy set theory? What does vagueness mean in data-to-text contexts? In what ways can fuzzy set theory contribute to improving data-to-text systems? What challenges do researchers from both disciplines need to address for a successful integration of fuzzy sets into data-to-text systems? In what cases should the use of fuzzy sets be avoided in D2T? To this end, we review and discuss the state of the art of vagueness modeling in natural language generation and data-to-text, describe potential and actual uses of fuzzy sets in data-to-text contexts, and provide some additional insights into the engineering of data-to-text systems that make use of fuzzy set-based techniques.
