Digesting Virtual "Geek" Culture: The Summarization of Technical Internet Relay Chats

This paper describes a summarization system for technical chats and emails about the Linux kernel. To reflect the complexity and sophistication of these discussions, the messages are clustered by subtopic structure at the sub-message level, and immediate responding pairs are identified using machine learning methods. The resulting summary consists of one or more mini-summaries, each covering one subtopic of the discussion.
