Speech and language processing over the web

This article has provided a brief overview of how the marriage of speech and language technologies with the Web is changing the way people communicate and access information. Advances are happening at a rapid pace, driven by improved algorithms, wider penetration of the Web, greater availability of data, and faster computing. As the Web continues to evolve, research initiatives must keep addressing difficult challenges in information mining of multimedia data, multimodal search of Web and media content, speaker recognition for reducing Internet fraud, interactive question answering for Web-based self-service, Web mining for knowledge discovery, two-way language translation, and Web page personalization and ranking. Progress in these areas will help generate new business opportunities in mobile Internet, secure voiceprints, ubiquitous multilingual communication, IPTV, globalized customer care, and search over massive amounts of data.
