Sensor Data and Speech

This chapter is addressed to readers who work with, or are interested in, time series of sensor data and the recognition of spoken language. We consider representations of cases and queries in these formats. We make use of signal processing but do not discuss detailed signal-level questions in speech; rather, we are interested in problems concerning the use of CBR, such as using signals as representation languages for cases. Both sensor data and speech are regarded as stochastic real-valued processes with an analogue representation. Our view is that this ranges from elementary signal recognition to understanding, which fits the local-global view used throughout the book. This view is guided by a level structure whose levels are concerned with stochastic processes, features, symbols, and an overall understanding. The stochastic processes are transformed in various ways until features can be extracted; in general, the features are coefficients of certain function representations, or combinations of them. Features provide the representation format in which the recorded signals are stored. Two kinds of similarity measures are considered: a feature measure and a symbolic measure. The feature measure has access to the features only; the symbolic measure requires semantics, because it refers to the meaning of the words. In speech, simple methods use linear prediction as a mathematical tool. Widely used features are the mel-frequency cepstral coefficients (MFCCs), which are not easy to compute; they are obtained from an analysis of the vocal tract. Some applications are presented that lead to a discussion of the error concept; the discussions are application-oriented. General knowledge of Part I is required, and a look at Chaps. 17 and 18 is recommended.
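To make the feature level concrete, the MFCC computation mentioned above can be sketched in plain NumPy: frame the signal, window it, take the power spectrum, apply a triangular mel filterbank, take logarithms, and finish with a DCT. This is a minimal illustrative sketch, not the chapter's reference implementation; the frame length (400 samples), hop (160 samples), filter count (26), coefficient count (13), and 16 kHz sample rate are common textbook defaults assumed here.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    # Triangular filters with centres spaced evenly on the mel scale.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        for k in range(l, c):            # rising edge of the triangle
            fb[i - 1, k] = (k - l) / max(c - l, 1)
        for k in range(c, r):            # falling edge of the triangle
            fb[i - 1, k] = (r - k) / max(r - c, 1)
    return fb

def mfcc(signal, sr=16000, frame_len=400, hop=160, n_filters=26, n_ceps=13):
    # Returns one row of n_ceps cepstral coefficients per frame.
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hamming(frame_len)
    fb = mel_filterbank(n_filters, frame_len, sr)
    # DCT-II basis: decorrelates the log filterbank energies.
    k = np.arange(n_filters)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), 2 * k + 1) / (2 * n_filters))
    ceps = []
    for t in range(n_frames):
        frame = signal[t * hop : t * hop + frame_len] * window
        power = np.abs(np.fft.rfft(frame, n=frame_len)) ** 2
        log_energies = np.log(fb @ power + 1e-10)
        ceps.append(dct @ log_energies)
    return np.array(ceps)
```

A feature-level similarity measure in the sense above can then compare two utterances directly on these coefficient matrices (e.g., frame-by-frame Euclidean distance), without any access to the meaning of the words.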
