Deconvolved Conventional Beamforming and Adaptive Cubature Kalman Filter Based Distant Speech Perception System

This paper proposes a spatial-temporal processing framework that integrates speech enhancement and speech tracking for distant speech perception. First, weak speech signals are enhanced by deconvolved conventional beamforming (DCBF) with a microphone array. Owing to the narrow beamwidth and low sidelobes of the DCBF, competing sources can be effectively suppressed without introducing extra speech distortion. Second, using the accurate bearing provided by the DCBF, a cubature Kalman filter tracks the speech source of interest. A scaling factor is introduced into the current statistical motion model, yielding a new tracking algorithm that is suitable for both maneuvering and nonmaneuvering speech sources. The scaling factor is adjusted adaptively to improve the tracking performance of the proposed algorithm under different motion models. Numerical results show that the proposed algorithm provides better tracking performance than the conventional algorithm. In particular, the tracking root mean square error can be halved in some cases.
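
To make the tracking step more concrete, the following is a minimal sketch of the standard third-degree cubature rule that underlies a cubature Kalman filter, applied to a bearing measurement. It is not the adaptive algorithm proposed in the paper: the two-dimensional position-only state, the function names `cubature_points` and `predict_bearing`, and the numerical values are assumptions chosen for illustration, and the adaptive scaling factor of the current statistical model is not included.

```python
import numpy as np


def cubature_points(mean, cov):
    """Return the 2n equally weighted cubature points, one per column."""
    n = mean.size
    sqrt_cov = np.linalg.cholesky(cov)  # cov = sqrt_cov @ sqrt_cov.T
    # Unit directions +/- sqrt(n) * e_i, stacked as an n x 2n matrix.
    unit = np.sqrt(n) * np.hstack([np.eye(n), -np.eye(n)])
    return mean[:, None] + sqrt_cov @ unit


def predict_bearing(mean, cov):
    """Propagate the points through a bearing measurement and average them.

    Assumes the state is (x, y) position relative to the array and that the
    bearings stay away from the +/- pi wrap, so a plain average is valid.
    """
    pts = cubature_points(mean, cov)
    bearings = np.arctan2(pts[1], pts[0])  # bearing of each point in radians
    z_pred = bearings.mean()               # predicted measurement
    p_zz = np.mean((bearings - z_pred) ** 2)  # predicted bearing variance
    return z_pred, p_zz


# Illustrative values only (hypothetical source position and uncertainty).
mean = np.array([3.0, 4.0])
cov = np.array([[0.04, 0.01],
                [0.01, 0.09]])
pts = cubature_points(mean, cov)
z_pred, p_zz = predict_bearing(mean, cov)
```

The points reproduce the state mean and covariance exactly, which is why their images under the nonlinear bearing function can be averaged to approximate the predicted measurement and its variance. In a full filter, measurement noise variance would be added to `p_zz` before the gain is computed.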
