Connectionist multivariate density-estimation and its application to speech synthesis