Online MVDR Beamformer Based on Complex Gaussian Mixture Model With Spatial Prior for Noise Robust ASR
Tomohiro Nakatani | Shoko Araki | Takuya Yoshioka | Marc Delcroix | Takuya Higuchi | Nobutaka Ito
[1] Tomohiro Nakatani,et al. Is speech enhancement pre-processing still relevant when using deep neural networks for acoustic modeling? , 2013, INTERSPEECH.
[2] Jacob Benesty,et al. On Optimal Frequency-Domain Multichannel Linear Filtering for Noise Reduction , 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[3] Rémi Gribonval,et al. Under-Determined Reverberant Audio Source Separation Using a Full-Rank Spatial Covariance Model , 2009, IEEE Transactions on Audio, Speech, and Language Processing.
[4] Shigeru Katagiri,et al. Cumulative moving averaged bottleneck speaker vectors for online speaker adaptation of CNN-based acoustic models , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[5] Tomohiro Nakatani,et al. Unsupervised discriminative adaptation using differenced maximum mutual information based linear regression , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[6] Tomohiro Nakatani,et al. Speaker-Aware Neural Network Based Beamformer for Speaker Extraction in Speech Mixtures , 2017, INTERSPEECH.
[7] Marc Delcroix,et al. Joint acoustic factor learning for robust deep neural network based automatic speech recognition , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[8] Lukás Burget,et al. Empirical Evaluation and Combination of Advanced Language Modeling Techniques , 2011, INTERSPEECH.
[9] Jon Barker,et al. The third ‘CHiME’ speech separation and recognition challenge: Dataset, task and baselines , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[10] Scott Rickard,et al. Blind separation of speech mixtures via time-frequency masking , 2004, IEEE Transactions on Signal Processing.
[11] Akihiko Sugiyama,et al. A robust adaptive beamformer for microphone arrays with a blocking matrix using constrained adaptive filters , 1999, IEEE Trans. Signal Process.
[12] Chengzhu Yu,et al. The NTT CHiME-3 system: Advances in speech enhancement and recognition for mobile multi-microphone devices , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[13] Hiroshi Sawada,et al. Underdetermined Convolutive Blind Source Separation via Frequency Bin-Wise Clustering and Permutation Alignment , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[14] Hermann Ney,et al. Improved backing-off for M-gram language modeling , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.
[15] Masakiyo Fujimoto,et al. Speaker indexing and speech enhancement in real meetings / conversations , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.
[16] Tara N. Sainath,et al. Deep Convolutional Neural Networks for Large-scale Speech Tasks , 2015, Neural Networks.
[17] Dietrich Klakow,et al. Beamforming With a Maximum Negentropy Criterion , 2009, IEEE Transactions on Audio, Speech, and Language Processing.
[18] Takuya Yoshioka,et al. Robust MVDR beamforming using time-frequency masks for online/offline ASR in noise , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[19] Hiroshi Sawada,et al. Solving the Permutation Problem of Frequency-Domain BSS when Spatial Aliasing Occurs with Wide Sensor Spacing , 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.
[20] Paris Smaragdis,et al. Blind separation of convolved mixtures in the frequency domain , 1998, Neurocomputing.
[21] Mark J. F. Gales,et al. Impact of single-microphone dereverberation on DNN-based meeting transcription systems , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[22] Hiroshi Sawada,et al. A Multichannel MMSE-Based Framework for Speech Source Separation and Noise Reduction , 2013, IEEE Transactions on Audio, Speech, and Language Processing.
[23] Michael S. Brandstein,et al. Robust Localization in Reverberant Rooms , 2001, Microphone Arrays.
[24] Gerald Penn,et al. Convolutional Neural Networks for Speech Recognition , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[25] Shoko Araki,et al. Meeting recognition with asynchronous distributed microphone array , 2017, 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[26] Florian Metze,et al. New Era for Robust Speech Recognition , 2017, Springer International Publishing.
[27] Tomohiro Nakatani,et al. Context adaptive deep neural networks for fast acoustic model adaptation , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[28] Steve Renals,et al. Hybrid acoustic models for distant and multichannel large vocabulary speech recognition , 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.
[29] Shinji Watanabe,et al. Discriminative approach to dynamic variance adaptation for noisy speech recognition , 2011, 2011 Joint Workshop on Hands-free Speech Communication and Microphone Arrays.
[30] Tomohiro Nakatani,et al. Learning speaker representation for neural network based multichannel speaker extraction , 2017, 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[31] Reinhold Häb-Umbach,et al. Blind speech separation employing directional statistics in an Expectation Maximization framework , 2010, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing.
[32] Shinji Watanabe,et al. Variance Compensation for Recognition of Reverberant Speech with Dereverberation Preprocessing , 2011, Robust Speech Recognition of Uncertain or Missing Data.
[33] Atsuo Hiroe,et al. Solution of Permutation Problem in Frequency Domain ICA, Using Multivariate Probability Density Functions , 2006, ICA.
[34] Radoslaw Mazur,et al. An Approach for Solving the Permutation Problem of Convolutive Blind Source Separation Based on Statistical Signal Models , 2009, IEEE Transactions on Audio, Speech, and Language Processing.
[35] Daniel P. W. Ellis,et al. An EM Algorithm for Localizing Multiple Sound Sources in Reverberant Environments , 2006, NIPS.
[36] Masakiyo Fujimoto,et al. Speech recognition in the presence of highly non-stationary noise based on spatial, spectral and temporal speech/noise modeling combined with dynamic variance adaptation , 2011 .
[37] Tomohiro Nakatani,et al. Dynamic variance adaptation using differenced maximum mutual information , 2012, MLSLP.
[38] Tomohiro Nakatani,et al. Online environmental adaptation of CNN-based acoustic models using spatial diffuseness features , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[39] Hiroshi Sawada,et al. A robust and precise method for solving the permutation problem of frequency-domain blind source separation , 2004, IEEE Transactions on Speech and Audio Processing.
[40] Takuya Yoshioka,et al. Relaxed disjointness based clustering for joint blind source separation and dereverberation , 2014, 2014 14th International Workshop on Acoustic Signal Enhancement (IWAENC).
[41] Lukás Burget,et al. Recurrent neural network based language model , 2010, INTERSPEECH.
[42] Tomohiro Nakatani,et al. Inverse Filtering for Speech Dereverberation Without the Use of Room Acoustics Information , 2010, Speech Dereverberation.
[43] Masakiyo Fujimoto,et al. Strategies for distant speech recognition in reverberant environments , 2015, EURASIP J. Adv. Signal Process.
[44] Dorothea Kolossa,et al. Missing feature speech recognition in a meeting situation with maximum SNR beamforming , 2008, 2008 IEEE International Symposium on Circuits and Systems.
[45] Rémi Gribonval,et al. Spatial location priors for Gaussian model based reverberant audio source separation , 2013, EURASIP J. Adv. Signal Process.
[46] Qiang Chen,et al. Network In Network , 2013, ICLR.
[47] Shinji Watanabe,et al. Combined static and dynamic variance adaptation for efficient interconnection of speech enhancement pre-processor with speech recognizer , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.
[48] Masakiyo Fujimoto,et al. Low-Latency Real-Time Meeting Recognition and Understanding Using Distant Microphones and Omni-Directional Camera , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[49] Hiroshi Sawada,et al. Underdetermined blind sparse source separation for arbitrarily arranged multiple sensors , 2007, Signal Process.
[50] L. J. Griffiths,et al. An alternative approach to linearly constrained adaptive beamforming , 1982 .
[51] Chengzhu Yu,et al. Context adaptive deep neural networks for fast acoustic model adaptation in noisy conditions , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[52] Tomohiro Nakatani,et al. Spatial correlation model based observation vector clustering and MVDR beamforming for meeting recognition , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[53] Masakiyo Fujimoto,et al. Linear prediction-based dereverberation with advanced speech enhancement and recognition technologies for the REVERB challenge , 2014 .
[54] Shinji Watanabe,et al. Discriminative feature transforms using differenced maximum mutual information , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[55] O. Hoshuyama,et al. A robust adaptive beamformer for microphone arrays with a blocking matrix using constrained adaptive filters , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.
[56] Tomohiro Nakatani,et al. Context Adaptive Neural Network for Rapid Adaptation of Deep CNN Based Acoustic Models , 2016, INTERSPEECH.