The Hearsay-II Speech-Understanding System: Integrating Knowledge to Resolve Uncertainty

The Hearsay-II system, developed during the DARPA-sponsored five-year speech-understanding research program, represents both a specific solution to the speech-understanding problem and a general framework for coordinating independent processes to achieve cooperative problem-solving behavior. As a computational problem, speech understanding reflects a large number of intrinsically interesting issues. Spoken sounds are achieved by a long chain of successive transformations, from intentions, through semantic and syntactic structuring, to the eventually resulting audible acoustic waves. As a consequence, interpreting speech means effectively inverting these transformations to recover the speaker's intention from the sound. At each step in the interpretive process, ambiguity and uncertainty arise. The Hearsay-II problem-solving framework reconstructs an intention from hypothetical interpretations formulated at various levels of abstraction. In addition, it allocates limited processing resources first to the most promising incremental actions. The final configuration of the Hearsay-II system comprises problem-solving components to generate and evaluate speech hypotheses, and a focus-of-control mechanism to identify potential actions of greatest value. Many of these specific procedures reveal novel approaches to speech problems. Most important, the system successfully integrates and coordinates all of these independent activities to resolve uncertainty and control combinatorics. Several adaptations of the Hearsay-II framework have already been undertaken in other problem domains, and it is anticipated that this trend will continue; many future systems necessarily will integrate diverse sources of knowledge to solve complex problems cooperatively.
Discussed in this paper are the characteristics of the speech problem in particular, the special kinds of problem-solving uncertainty in that domain, the structure of the Hearsay-II system developed to cope with that uncertainty, and the relationship between Hearsay-II's structure and those of other speech-understanding systems. The paper is intended for the general computer science audience and presupposes no speech or artificial intelligence background.
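
For readers without an artificial intelligence background, the two ideas named above (hypotheses posted at several levels of abstraction, and a focus-of-control mechanism that spends a limited budget on the most promising action first) can be illustrated with a minimal Python sketch. This is a toy under stated assumptions, not the Hearsay-II implementation: the names `Hypothesis`, `Blackboard`, and `word_from_syllable`, the level labels, the tiny lexicon, and the rule of ranking a pending action by the rating of its triggering hypothesis are all hypothetical.

```python
import heapq
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Hypothesis:
    level: str     # abstraction level, e.g. "syllable" or "word" (illustrative labels)
    content: str
    rating: float  # assumed credibility estimate in [0, 1]


class Blackboard:
    """Shared store of hypotheses plus an agenda of pending actions."""

    def __init__(self):
        self.hypotheses = []
        self.sources = []      # (trigger_level, function) pairs
        self._agenda = []      # heap ordered by negated rating (best first)
        self._tie = count()    # unique tie-breaker so heap entries always compare

    def register(self, trigger_level, source):
        self.sources.append((trigger_level, source))

    def post(self, hyp):
        # Record the hypothesis and schedule every source triggered by its level.
        self.hypotheses.append(hyp)
        for level, source in self.sources:
            if level == hyp.level:
                heapq.heappush(self._agenda, (-hyp.rating, next(self._tie), source, hyp))

    def run(self, budget):
        # Execute at most `budget` actions, always taking the highest-rated one.
        executed = []
        while self._agenda and budget > 0:
            _, _, source, hyp = heapq.heappop(self._agenda)
            executed.append(hyp.content)
            for new_hyp in source(hyp):
                self.post(new_hyp)
            budget -= 1
        return executed


def word_from_syllable(hyp):
    # Hypothetical knowledge source: propose a word from a syllable hypothesis.
    lexicon = {"hel": "hello", "wor": "world"}
    word = lexicon.get(hyp.content)
    return [Hypothesis("word", word, hyp.rating * 0.9)] if word else []


bb = Blackboard()
bb.register("syllable", word_from_syllable)
bb.post(Hypothesis("syllable", "wor", 0.4))
bb.post(Hypothesis("syllable", "hel", 0.8))
bb.post(Hypothesis("syllable", "zzz", 0.6))
order = bb.run(budget=2)
words = [h.content for h in bb.hypotheses if h.level == "word"]
```

With a budget of two actions, the sketch processes the syllables rated 0.8 and 0.6 and never reaches the one rated 0.4, so `order` is `["hel", "zzz"]` and only `"hello"` is posted at the word level. The point is the scheduling pattern: independent sources communicate only through the shared store, and a single agenda decides which one runs next.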
