Composition of Deep and Spiking Neural Networks for Very Low Bit Rate Speech Coding
Milos Cernak | Afsaneh Asaei | Philip N. Garner | Alexandros Lazaridis
[1] Chin-Hui Lee,et al. Boosting attribute and phone estimation accuracies with deep neural networks for detection-based speech recognition , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[2] Jonathan G. Fiscus,et al. DARPA TIMIT: acoustic-phonetic continuous speech corpus CD-ROM, NIST speech disc 1-1.1 , 1993 .
[3] Dong Yu,et al. Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[4] Rohit Prabhavalkar,et al. Compressing deep neural networks using a rank-constrained topology , 2015, INTERSPEECH.
[5] Dau-Cheng Lyu,et al. Experiments on Cross-Language Attribute Detection and Phone Recognition With Minimal Target-Specific Training Data , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[6] Tara N. Sainath,et al. Structured Transforms for Small-Footprint Deep Learning , 2015, NIPS.
[7] Milos Cernak,et al. A simple continuous excitation model for parametric vocoding , 2015 .
[8] Torsten Dau,et al. Speech Intelligibility Evaluation for Mobile Phones. , 2015 .
[9] Keiichi Tokuda,et al. Speaker adaptation and the evaluation of speaker similarity in the EMIME speech-to-speech translation project , 2010, SSW.
[10] Yoshua Bengio,et al. Learning Deep Architectures for AI , 2007, Found. Trends Mach. Learn..
[11] George R. Doddington,et al. A phonetic vocoder , 1989, International Conference on Acoustics, Speech, and Signal Processing.
[12] Steven Greenberg,et al. Linguistic Dissection of Switchboard-Corpus Automatic Speech Recognition Systems , 2000 .
[13] Liang Lu,et al. Small-Footprint Deep Neural Networks with Highway Connections for Speech Recognition , 2015, INTERSPEECH.
[14] Paul Taylor,et al. Festival Speech Synthesis System , 1998 .
[15] Jesper Jensen,et al. A short-time objective intelligibility measure for time-frequency weighted noisy speech , 2010, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing.
[16] Frank K. Soong,et al. On the training aspects of Deep Neural Network (DNN) for parametric TTS synthesis , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[17] Thomas F. Quatieri,et al. Multisensor very lowbit rate speech coding using segment quantization , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.
[18] Tara N. Sainath,et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups , 2012, IEEE Signal Processing Magazine.
[19] Kai Yu,et al. From discontinuous to continuous F0 modelling in HMM-based speech synthesis , 2010, SSW.
[20] Shigeo Morishima,et al. Speech coding based on a multi-layer neural network , 1990, IEEE International Conference on Communications, Including Supercomm Technical Sessions.
[21] Geoffrey E. Hinton,et al. Distilling the Knowledge in a Neural Network , 2015, ArXiv.
[22] Sarah Eichmann,et al. English Sound Structure , 2016 .
[23] José Mira,et al. Engineering Applications of Bio-Inspired Artificial Neural Networks , 1999, Lecture Notes in Computer Science.
[24] Milos Cernak,et al. On structured sparsity of phonological posteriors for linguistic parsing , 2016, Speech Commun..
[25] João P. Cabral,et al. Using Noisy Speech to Study the Robustness of a Continuous F0 Modelling Method in HMM-based Speech Synthesis , 2012 .
[26] M. Sahani,et al. Demodulation as Probabilistic Inference , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[27] Heiga Zen,et al. Statistical parametric speech synthesis using deep neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[28] S. Roucos,et al. Segment quantization for very-low-rate speech coding , 1982, ICASSP.
[29] R. Kubichek,et al. Mel-cepstral distance measure for objective speech quality assessment , 1993, Proceedings of IEEE Pacific Rim Conference on Communications Computers and Signal Processing.
[30] Milos Cernak,et al. Speech vocoding for laboratory phonology , 2017, Comput. Speech Lang..
[31] Yang Zhen. Prediction in speech coding: the modification of the coding of LPC parameters and nonlinear estimation technique by using ANN , 1996, Proceedings of Third International Conference on Signal Processing (ICSP'96).
[32] Hao Jiang,et al. A robust 800 bps MBE coder with VQ and MLP , 1998, ICCT'98. 1998 International Conference on Communication Technology. Proceedings.
[33] Vahid Tabataba Vakili,et al. Complexity Reduction of LD-CELP Speech Coding in Prediction of Gain Using Neural Networks , 2009 .
[34] Carla Teixeira Lopes,et al. TIMIT Acoustic-Phonetic Continuous Speech Corpus , 2012 .
[35] S. Dimolitsas,et al. Current objectives in 4-kb/s wireline-quality speech coding standardization , 1994, IEEE Signal Processing Letters.
[36] Hervé Bourlard,et al. Connectionist Speech Recognition: A Hybrid Approach , 1993 .
[37] Bishnu S. Atal,et al. A new model of LPC excitation for producing natural-sounding speech at low bit rates , 1982, ICASSP.
[38] Joon-Hyuk Chang,et al. Packet Loss Concealment Based on Deep Neural Networks for Digital Speech Transmission , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[39] Lizhong Wu,et al. Fully vector-quantized neural network-based code-excited nonlinear predictive speech coding , 1994, IEEE Trans. Speech Audio Process..
[40] Robert M. Gray,et al. Matrix quantizer design for LPC speech using the generalized Lloyd algorithm , 1985, IEEE Trans. Acoust. Speech Signal Process..
[41] Bruno Gas,et al. Discriminative coding with predictive neural networks , 1999 .
[42] Geneviève Baudoin,et al. Corpus based very low bit rate speech coding , 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03)..
[43] Yee Whye Teh,et al. A Fast Learning Algorithm for Deep Belief Nets , 2006, Neural Computation.
[44] Methods for Subjective Determination of Transmission Quality , 2022 .
[45] Manfred R. Schroeder,et al. Code-excited linear prediction (CELP): High-quality speech at very low bit rates , 1985, ICASSP '85. IEEE International Conference on Acoustics, Speech, and Signal Processing.
[46] Milos Cernak,et al. Syllable-based pitch encoding for low bit rate speech coding with recognition/synthesis architecture , 2013, INTERSPEECH.
[47] Richard M. Schwartz,et al. A segment vocoder at 150 b/s , 1983, ICASSP.
[48] Keiichi Tokuda,et al. A very low bit rate speech coder using HMM-based speech recognition/synthesis techniques , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98.
[49] Shaul Markovitch,et al. Anytime Learning of Decision Trees , 2007, J. Mach. Learn. Res..
[50] Masaaki Honda,et al. LPC speech coding based on variable-length segment quantization , 1988, IEEE Trans. Acoust. Speech Signal Process..
[51] Milos Cernak,et al. Neuromorphic based oscillatory device for incremental syllable boundary detection , 2015, INTERSPEECH.
[52] Janet M. Baker,et al. The Design for the Wall Street Journal-based CSR Corpus , 1992, HLT.
[53] Jae S. Lim,et al. Multiband excitation vocoder , 1988, IEEE Transactions on Acoustics, Speech, and Signal Processing.
[54] Richard V. Cox,et al. A very low bit rate speech coder based on a recognition/synthesis paradigm , 2001, IEEE Trans. Speech Audio Process..
[55] Noam Chomsky,et al. The Sound Pattern of English , 1968 .
[56] Ankoor S. Shah,et al. An oscillatory hierarchy controlling neuronal excitability and stimulus processing in the auditory cortex. , 2005, Journal of neurophysiology.
[57] S. Hunt. A nonlinear adaptive predictor for speech compression , 1996, Proceedings of International Conference on Neural Networks (ICNN'96).
[58] Jerry D. Gibson,et al. Speech Compression , 2016, Inf..
[59] Philipos C. Loizou,et al. Speech Quality Assessment , 2011, Multimedia Analysis, Processing and Communications.
[60] Milos Cernak,et al. PhonVoc: A Phonetic and Phonological Vocoding Toolkit , 2016, INTERSPEECH.
[61] Milos Cernak,et al. Incremental Syllable-Context Phonetic Vocoding , 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[62] Nicholas D. Lane,et al. DeepEar: robust smartphone audio sensing in unconstrained acoustic environments using deep learning , 2015, UbiComp.
[63] Richard E. Turner,et al. A role for amplitude modulation phase relationships in speech rhythm perception. , 2014, The Journal of the Acoustical Society of America.
[64] D. Wong,et al. Very low data rate speech compression with LPC vector and matrix quantization , 1983, ICASSP.
[65] Geoffrey E. Hinton,et al. Deep Belief Networks for phone recognition , 2009 .
[66] Kai Yu,et al. Continuous F0 Modeling for HMM Based Statistical Parametric Speech Synthesis , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[67] Heiga Zen,et al. The HMM-based speech synthesis system (HTS) version 2.0 , 2007, SSW.
[68] Milos Cernak,et al. Phonological vocoding using artificial neural networks , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[69] Miroslav Líška,et al. SLOVAK UNIVERSITY OF TECHNOLOGY IN BRATISLAVA , 2010 .
[70] Philip N. Garner,et al. DNN-Based Speech Synthesis: Importance of Input Features and Training Data , 2015, SPECOM.
[71] Alexandre Hyafil,et al. Speech encoding by coupled cortical theta and gamma oscillations , 2015, eLife.
[72] Daniel Povey,et al. The Kaldi Speech Recognition Toolkit , 2011 .
[73] Marcos Faúndez-Zanuy. Adaptive Hybrid Speech Coding with a MLP/LPC Structure , 1999, IWANN.
[74] Jonathan G. Fiscus,et al. DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus CD-ROM (TIMIT), NIST , 1993 .
[75] Biing-Hwang Juang,et al. An 800 bit/s vector quantization LPC vocoder , 1982 .
[76] Jesper Jensen,et al. An Algorithm for Intelligibility Prediction of Time–Frequency Weighted Noisy Speech , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[77] Heiga Zen,et al. Context adaptive training with factorized decision trees for HMM-based statistical parametric speech synthesis , 2011, Speech Commun..
[78] Milos Cernak,et al. On compressibility of neural network phonological features for low bit rate speech coding , 2015, INTERSPEECH.
[79] C. C. Goodyear,et al. A CELP codebook and search technique using a Hopfield net , 1991, [Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing.