Lessons from a failure: Generating tailored smoking cessation letters

STOP is a Natural Language Generation (NLG) system that generates short tailored smoking cessation letters, based on responses to a four-page smoking questionnaire. A clinical trial with 2553 smokers showed that STOP was not effective; that is, recipients of a non-tailored letter were as likely to stop smoking as recipients of a tailored letter. In this paper we describe the STOP system and clinical trial. Although it is rare for AI papers to present negative results, we believe that useful lessons can be learned from STOP. We also believe that the AI community as a whole could benefit from considering the issue of how, when, and why negative results should be reported; certainly a major difference between AI and more established fields such as medicine is that very few AI papers report negative results.
