Enhancement of formant regions in magnitude spectra to develop children's KWS system in zero resource scenario

[1]  Ziyue Wang,et al.  More is Less: Domain-Specific Speech Recognition Microprocessor Using One-Dimensional Convolutional Recurrent Neural Network , 2022, IEEE Transactions on Circuits and Systems I: Regular Papers.

[2]  G. Pradhan,et al.  Data-Adaptive Single-Pole Filtering of Magnitude Spectra for Robust Keyword Spotting , 2022, Circuits, Systems, and Signal Processing.

[3]  Gayadhar Pradhan,et al.  Pitch-robust acoustic feature using single frequency filtering for children's KWS , 2021, Pattern Recognit. Lett..

[4]  G. Pradhan,et al.  An approach for reducing pitch induced mismatches to detect keywords in children’s speech , 2021, Multimedia Tools and Applications.

[5]  Gayadhar Pradhan,et al.  Pitch and noise normalized acoustic feature for children's ASR , 2021, Digit. Signal Process..

[6]  Longxing Shi,et al.  A 22nm, 10.8 μW/15.1 μW Dual Computing Modes High Power-Performance-Area Efficiency Domained Background Noise Aware Keyword-Spotting Processor , 2020, IEEE Transactions on Circuits and Systems I: Regular Papers.

[7]  Jyoti Prakash Singh,et al.  A Pitch and Noise Robust Keyword Spotting System Using SMAC Features with Prosody Modification , 2020, Circuits, Systems, and Signal Processing.

[8]  Gayadhar Pradhan,et al.  Significance of Pitch-Based Spectral Normalization for Children's Speech Recognition , 2019, IEEE Signal Processing Letters.

[9]  Gayadhar Pradhan,et al.  Adaptive spectral smoothening for development of robust keyword spotting system , 2019, IET Signal Process..

[10]  S. Shahnawazuddin,et al.  Addressing noise and pitch sensitivity of speech recognition system through variational mode decomposition based spectral smoothing , 2019, Digit. Signal Process..

[11]  S. Shahnawazuddin,et al.  Improving the performance of keyword spotting system for children's speech through prosody modification , 2019, Digit. Signal Process..

[12]  Alka Agrawal,et al.  Voice Biometric: A Technology for Voice Based Authentication , 2018, Advanced Science, Engineering and Medicine.

[13]  Syed Shahnawazuddin,et al.  Assessment of pitch-adaptive front-end signal processing for children's speech recognition , 2018, Comput. Speech Lang..

[14]  S. R. Mahadeva Prasanna,et al.  Development of Multi-Level Speech based Person Authentication System , 2017, J. Signal Process. Syst..

[15]  Diego Giuliani,et al.  Deep-neural network approaches for speech recognition with heterogeneous groups of speakers including children , 2016, Natural Language Engineering.

[16]  Yongqiang Wang,et al.  An investigation of deep neural networks for noise robust speech recognition , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.

[17]  Shrikanth S. Narayanan,et al.  A review of ASR technologies for children's speech , 2009, WOCCI.

[18]  Fabio Brugnara,et al.  Acoustic variability and automatic recognition of children's speech , 2007, Speech Commun..

[19]  Éric Gaussier,et al.  Relation between PLSA and NMF and implications , 2005, SIGIR '05.

[20]  Shrikanth S. Narayanan,et al.  Robust recognition of children's speech , 2003, IEEE Trans. Speech Audio Process..

[21]  Shrikanth S. Narayanan,et al.  Creating conversational interfaces for children , 2002, IEEE Trans. Speech Audio Process..

[22]  Daben Liu,et al.  Speech and language technologies for audio indexing and retrieval , 2000, Proceedings of the IEEE.

[23]  Mark J. F. Gales,et al.  Maximum likelihood linear transformations for HMM-based speech recognition , 1998, Comput. Speech Lang..

[24]  Richard M. Schwartz,et al.  A compact model for speaker-adaptive training , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.

[25]  Herbert Gish,et al.  A parametric approach to vocal tract length normalization , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.

[26]  Vassilios Digalakis,et al.  Speaker adaptation using constrained estimation of Gaussian mixtures , 1995, IEEE Trans. Speech Audio Process..

[27]  Philip C. Woodland,et al.  Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models , 1995, Comput. Speech Lang..

[28]  J. Gauvain,et al.  Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains , 1994, IEEE Trans. Speech Audio Process..

[29]  Herman J. M. Steeneken,et al.  Assessment for automatic speech recognition: II. NOISEX-92: A database and an experiment to study the effect of additive noise on speech recognition systems , 1993, Speech Commun..

[30]  D. Byrd,et al.  Preliminary results on speaker-dependent variation in the TIMIT database , 1992, The Journal of the Acoustical Society of America.

[31]  Kai-Fu Lee,et al.  On speaker-independent, speaker-dependent, and speaker-adaptive speech recognition , 1991, [Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing.

[32]  H. Hermansky,et al.  Perceptual linear predictive (PLP) analysis of speech , 1990, The Journal of the Acoustical Society of America.

[33]  John Makhoul,et al.  LPCW: An LPC vocoder with linear predictive spectral warping , 1976, ICASSP.

[34]  Murat Saraclar,et al.  Generative RNNs for OOV Keyword Search , 2019, IEEE Signal Processing Letters.

[35]  Li Lee,et al.  A frequency warping approach to speaker normalization , 1998, IEEE Trans. Speech Audio Process..

[36]  Norman Fraser,et al.  Voice-based Dialogue in the Real World , 1997.

[37]  Tao Chen,et al.  Analysis of Speaker Variability , 2022.