Document expansion for speech retrieval

Methods of document expansion are described for speech retrieval documents transcribed by a recognizer. A database of vectors of automatic transcriptions of documents is accessed, and each vector is truncated by removing all terms that the recognizer cannot recognize, which yields truncated vectors. Terms in the truncated vectors are then weighted to associate the truncated vectors with the untruncated vectors. The terms not recognized by the recognizer are then added back to the weighted, truncated vectors. The retrieval effectiveness may then be measured.
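
The three steps named in the abstract (truncate, weight, add back) can be illustrated with a minimal Python sketch. The abstract does not give the weighting formula, so everything numeric here is an assumption: the rescaling rule in `reweight`, the `oov_factor` parameter, and the function names themselves are hypothetical and chosen only to make the data flow concrete. Vectors are represented as plain dictionaries mapping a term to its weight.

```python
def truncate(vector, vocabulary):
    """Drop every term the recognizer cannot recognize."""
    return {term: w for term, w in vector.items() if term in vocabulary}


def reweight(truncated, untruncated):
    """Assumed weighting rule (not from the source): rescale the truncated
    vector so its total weight matches that of the untruncated vector."""
    kept_mass = sum(truncated.values())
    if kept_mass == 0:
        return dict(truncated)
    scale = sum(untruncated.values()) / kept_mass
    return {term: w * scale for term, w in truncated.items()}


def add_back(weighted, untruncated, vocabulary, oov_factor=0.5):
    """Re-insert the terms that were removed because they are outside the
    recognizer's vocabulary. oov_factor is a hypothetical down-weighting."""
    expanded = dict(weighted)
    for term, w in untruncated.items():
        if term not in vocabulary:
            expanded[term] = w * oov_factor
    return expanded


# Toy data: one document vector and a recognizer vocabulary.
vocabulary = {"clinton", "senate"}
vector = {"clinton": 2.0, "senate": 1.0, "lewinsky": 1.0}

truncated = truncate(vector, vocabulary)
weighted = reweight(truncated, vector)
expanded = add_back(weighted, vector, vocabulary)
```

In this toy run the out-of-vocabulary term `lewinsky` is removed by `truncate` and reappears in `expanded`, which mirrors the order of operations in the abstract. A real system would replace the assumed rescaling with whatever term-weighting scheme the method actually specifies.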
