Paper Information - Intelligently Integrating Information from Speech and Vision Processing to Perform Light-weight Meeting Understanding

Important information is often generated at meetings, but identifying and retrieving that information after the meeting is not always simple. Automatically capturing such information and making it available for later retrieval has therefore become a topic of some interest. Most approaches to this problem have involved constructing specialized instrumented meeting rooms that allow a meeting to be captured in great detail. We propose an alternative approach that focuses on people’s information retrieval needs and makes use of a light-weight data collection system that allows data acquisition on portable equipment, such as personal laptops. Issues that arise include the integration of information from different audio and video streams and the optimal use of scarce computing resources. This paper describes our current development of a light-weight portable meeting recording infrastructure, as well as the use of streams of visual and audio information to derive structure from meetings. The goal is to make meeting contents easily accessible to people.

Alexander I. Rudnicky | Satanjeev Banerjee | Paul E. Rybski | Francisco Veloso
