Towards Multi-modal Extraction and Summarization of Conversations

For many business intelligence applications, decision making depends critically on the information contained in all forms of "informal" text documents, such as emails, meeting summaries, attachments and web documents. For example, suppose the topic of developing a new product is first raised in a meeting. In subsequent follow-up emails, additional comments and discussions are added, including links to web documents describing similar products in the market and user reviews of those products. A concise summary of this "conversation" is obviously valuable. However, existing technologies are inadequate in at least two fundamental ways. First, extracting "conversations" embedded in multi-genre documents is very challenging. Second, existing multi-document summarization techniques, which were designed mainly for formal documents, have proved to be highly ineffective when applied to informal documents like emails. In this talk, we first review some of the earlier work on extracting email conversations. We also give an overview of email summarization and meeting summarization methods. We then present several open problems that need to be solved for multi-modal extraction and summarization of conversations to become a reality. Last but not least, we specifically focus on extraction and summarization of sentiments in emails and blogs.