Evaluating the effectiveness of features and sampling in extractive meeting summarization

Feature-based approaches are widely used for extractive meeting summarization. In this paper, we analyze and evaluate the effectiveness of different types of features using forward feature selection with an SVM classifier. In addition to features used in prior studies, we introduce topic-related features and demonstrate that they are helpful for meeting summarization. We also propose a new way to resample sentences based on their salience scores for model training and testing. Experimental results on both human transcripts and speech recognition output, evaluated with the ROUGE summarization metrics, show that feature selection and data resampling improve system performance.
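The forward feature selection mentioned in the abstract can be illustrated with a generic greedy loop. This is only a minimal sketch, not the authors' implementation: the function `forward_select`, the `min_gain` stopping threshold, the feature names, and the `toy_score` function are all hypothetical. In a setup like the one described, the scoring function would presumably train an SVM on the candidate feature subset and return a ROUGE score on held-out data; here a hand-made score stands in so the example runs on its own.

```python
from typing import Callable, FrozenSet, List, Sequence


def forward_select(
    features: Sequence[str],
    score: Callable[[FrozenSet[str]], float],
    min_gain: float = 1e-6,  # hypothetical stopping threshold
) -> List[str]:
    """Greedy forward selection: repeatedly add the single feature
    that most improves `score`, and stop when no feature helps."""
    selected: List[str] = []
    remaining = list(features)
    best = score(frozenset())
    while remaining:
        # Score each remaining feature when added to the current subset.
        scored = [(score(frozenset(selected + [f])), f) for f in remaining]
        top_score, top_feat = max(scored, key=lambda pair: pair[0])
        if top_score - best < min_gain:
            break  # no candidate gives a meaningful improvement
        selected.append(top_feat)
        remaining.remove(top_feat)
        best = top_score
    return selected


def toy_score(subset: FrozenSet[str]) -> float:
    """Stand-in for 'train a classifier on `subset`, evaluate on dev data'.
    The gains below are invented for illustration only."""
    gains = {"tfidf": 0.30, "topic": 0.10, "length": 0.05, "speaker": 0.0}
    total = sum(gains[f] for f in subset)
    # Pretend 'length' is redundant once 'tfidf' is present.
    if "tfidf" in subset and "length" in subset:
        total -= 0.05
    return total


chosen = forward_select(["length", "speaker", "tfidf", "topic"], toy_score)
print(chosen)
```

With the invented gains above, the loop picks `tfidf` first, then `topic`, and stops because neither remaining feature raises the score. The point of the sketch is the selection order: a feature that looks useful alone (`length`) can be skipped once a stronger, overlapping feature is already in the set.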
