Learning from the Report-writing Behavior of Individuals

We describe a briefing system that learns to predict the contents of reports generated by users who create periodic (weekly) reports as part of their normal activity. The system observes the content-selection choices that users make and builds a predictive model that could, for example, be used to generate an initial draft of a report. Through a feature of the interface, the system also collects information about potential user-specific features. The system was evaluated under realistic conditions by collecting data in a project-based university course, in which student group leaders were tasked with preparing weekly reports for the instructors using material from individual student reports. This paper addresses the question of whether data derived from the implicit supervision provided by end users is robust enough to support not only model parameter tuning but also a form of feature discovery. Results indicate that this is the case: system performance improves based on feedback from user activity. We find that the learned models (and features) are user-specific, although not completely idiosyncratic. This may suggest that approaches that seek to optimize models globally (say, over a large corpus of data) may not in fact produce results acceptable to all individuals.