Paper information - Shallow and deep learning for audio and natural language processing - 字舞流文

Shallow and deep learning for audio and natural language processing

Po-Sen Huang

[1] Jaana Kekäläinen,et al. IR evaluation methods for retrieving highly relevant documents , 2000, SIGIR Forum.

[2] Thomas Hofmann,et al. Probabilistic Latent Semantic Indexing , 1999, SIGIR Forum.

[3] Haim Avron,et al. High-Performance Kernel Machines With Implicit Distributed Optimization and Randomization , 2014, Technometrics.

[4] Vikas Sindhwani,et al. Quasi-Monte Carlo Feature Maps for Shift-Invariant Kernels , 2014, J. Mach. Learn. Res.

[5] Joan Bruna,et al. Source separation with scattering Non-Negative Matrix Factorization , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[6] Geoffrey Zweig,et al. From captions to visual concepts and back , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[7] Li-Rong Dai,et al. A Regression Approach to Speech Enhancement Based on Deep Neural Networks , 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.

[8] Inderjit S. Dhillon,et al. Fast Prediction for Large-Scale Kernel Machines , 2014, NIPS.

[9] DeLiang Wang,et al. On Training Targets for Supervised Speech Separation , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.

[10] Björn W. Schuller,et al. Discriminatively trained recurrent neural networks for single-channel speech separation , 2014, 2014 IEEE Global Conference on Signal and Information Processing (GlobalSIP).

[11] Brian Kingsbury,et al. How to Scale Up Kernel Methods to Be As Good As Deep Neural Nets , 2014, ArXiv.

[12] Yelong Shen,et al. A Latent Semantic Model with Convolutional-Pooling Structure for Information Retrieval , 2014, CIKM.

[13] Jianfeng Gao,et al. Deep Learning for Natural Language Processing: Theory and Practice (Tutorial) , 2014 .

[14] Paris Smaragdis,et al. Singing-Voice Separation from Monaural Recordings using Deep Recurrent Neural Networks , 2014, ISMIR.

[15] Jun Du,et al. Deep neural network based speech separation for robust speech recognition , 2014, 2014 12th International Conference on Signal Processing (ICSP).

[16] Dong Yu,et al. Deep Learning: Methods and Applications , 2014, Found. Trends Signal Process.

[17] Jianfeng Gao,et al. Learning Continuous Phrase Representations for Translation Modeling , 2014, ACL.

[18] Christopher Meek,et al. Semantic Parsing for Single-Relation Question Answering , 2014, ACL.

[19] Hui Zhang,et al. Deep stacking networks with time series for speech separation , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[20] Björn W. Schuller,et al. Single-channel speech separation with memory-enhanced recurrent neural networks , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[21] Paris Smaragdis,et al. Deep learning for monaural speech separation , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[22] Jianfeng Gao,et al. Deep Learning for Natural Language Processing and Related Applications (Tutorial at ICASSP) , 2014 .

[23] Yajie Miao,et al. Kaldi+PDNN: Building DNN-based ASR Systems with Kaldi and PDNN , 2014, ArXiv.

[24] Franco Scarselli,et al. On the Complexity of Neural Network Classifiers: A Comparison Between Shallow and Deep Architectures , 2014, IEEE Transactions on Neural Networks and Learning Systems.

[25] Razvan Pascanu,et al. How to Construct Deep Recurrent Neural Networks , 2013, ICLR.

[26] Hakan Erdogan,et al. Deep neural networks for single channel source separation , 2013, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[27] Paris Smaragdis,et al. Experiments on deep learning for speech denoising , 2014, INTERSPEECH.

[28] Joel Z. Leibo,et al. Unsupervised Learning of Invariant Representations in Hierarchical Architectures , 2013, ArXiv.

[29] Larry P. Heck,et al. Learning deep structured semantic models for web search using clickthrough data , 2013, CIKM.

[30] Yoshua Bengio,et al. Investigation of recurrent-neural-network architectures and learning methods for spoken language understanding , 2013, INTERSPEECH.

[31] Dong Yu,et al. Tensor Deep Stacking Networks , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[32] Steven C. H. Hoi,et al. MKBoost: A Framework of Multiple Kernel Boosting , 2013, IEEE Transactions on Knowledge and Data Engineering.

[33] Alexander J. Smola,et al. Fastfood: Approximate Kernel Expansions in Loglinear Time , 2014, ArXiv.

[34] Martin J. Wainwright,et al. Divide and Conquer Kernel Ridge Regression , 2013, COLT.

[35] Jianfeng Gao,et al. Deep stacking networks for information retrieval , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.

[36] DeLiang Wang,et al. Ideal ratio mask estimation using deep neural networks for robust speech recognition , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.

[37] Po-Sen Huang,et al. Random features for Kernel Deep Convex Network , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.

[38] Benjamin Schrauwen,et al. Training and Analysing Deep Recurrent Neural Networks , 2013, NIPS.

[39] Yi-Hsuan Yang,et al. Low-Rank Representation of Both Singing Voice and Music Accompaniment Via Learned Dictionaries , 2013, ISMIR.

[40] Rong Jin,et al. Nyström Method vs Random Fourier Features: A Theoretical and Empirical Comparison , 2012, NIPS.

[41] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[42] Gökhan Tür,et al. Use of kernel deep convex networks and end-to-end learning for spoken language understanding , 2012, 2012 IEEE Spoken Language Technology Workshop (SLT).

[43] Yifan Gong,et al. Improving wideband speech recognition using mixed-bandwidth training data in CD-DNN-HMM , 2012, 2012 IEEE Spoken Language Technology Workshop (SLT).

[44] Narendra Ahuja,et al. Online learning with kernels: Overcoming the growing sum problem , 2012, 2012 IEEE International Workshop on Machine Learning for Signal Processing.

[45] Grégoire Montavon,et al. Neural Networks: Tricks of the Trade , 2012, Lecture Notes in Computer Science.

[46] Yi-Hsuan Yang,et al. On sparse and low-rank matrix decomposition for singing voice separation , 2012, ACM Multimedia.

[47] Tara N. Sainath,et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups , 2012, IEEE Signal Processing Magazine.

[48] Andrew Y. Ng,et al. Semantic Compositionality through Recursive Matrix-Vector Spaces , 2012, EMNLP.

[49] Gökhan Tür,et al. Towards deeper understanding: Deep convex networks for semantic utterance classification , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[50] Paris Smaragdis,et al. Singing-voice separation from monaural recordings using robust principal component analysis , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[51] Dong Yu,et al. Scalable stacking and learning for building deep architectures , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[52] Dong Yu,et al. Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition , 2012, IEEE Transactions on Audio, Speech, and Language Processing.

[53] Guillermo Sapiro,et al. Real-time Online Singing Voice Separation from Monaural Recordings Using Robust Low-rank Modeling , 2012, ISMIR.

[54] Quoc V. Le,et al. Recurrent Neural Networks for Noise Reduction in Robust ASR , 2012, INTERSPEECH.

[55] Li Deng,et al. Are Sparse Representations Rich Enough for Acoustic Modeling? , 2012, INTERSPEECH.

[56] Yoshua Bengio,et al. Shallow vs. Deep Sum-Product Networks , 2011, NIPS.

[57] Yoshua Bengio,et al. On the Expressive Power of Deep Architectures , 2011, ALT.

[58] Jesper Jensen,et al. An Algorithm for Intelligibility Prediction of Time–Frequency Weighted Noisy Speech , 2011, IEEE Transactions on Audio, Speech, and Language Processing.

[59] Dong Yu,et al. Deep Convex Net: A Scalable Architecture for Speech Pattern Classification , 2011, INTERSPEECH.

[60] Jianfeng Gao,et al. Clickthrough-based latent semantic models for web search , 2011, SIGIR.

[61] Yoshua Bengio,et al. Deep Sparse Rectifier Neural Networks , 2011, AISTATS.

[62] Tara N. Sainath,et al. Deep Belief Networks using discriminative features for phone recognition , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[63] Brian Kingsbury,et al. Arccosine kernels: Acoustic modeling with infinite neural networks , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[64] Jason Weston,et al. Natural Language Processing (Almost) from Scratch , 2011, J. Mach. Learn. Res.

[65] Gökhan Tür,et al. What is left to be understood in ATIS? , 2010, 2010 IEEE Spoken Language Technology Workshop.

[66] Jianfeng Gao,et al. Clickthrough-based translation models for web search: from word models to phrase models , 2010, CIKM.

[67] John C. Platt,et al. Translingual Document Representations from Discriminative Projections , 2010, EMNLP.

[68] Ameet Talwalkar,et al. On the Impact of Kernel Approximation on Learning Accuracy , 2010, AISTATS.

[69] Jyh-Shing Roger Jang,et al. On the Improvement of Singing Voice Separation for Monaural Recordings Using the MIR-1K Dataset , 2010, IEEE Transactions on Audio, Speech, and Language Processing.

[70] Ameet Talwalkar,et al. Ensemble Nyström Method , 2009, NIPS.

[71] Lawrence K. Saul,et al. Kernel Methods for Deep Learning , 2009, NIPS.

[72] Christopher J. C. Burges,et al. A machine learning approach for improved BM25 retrieval , 2009, CIKM.

[73] Wei Yuan,et al. Smoothing clickthrough data for web search ranking , 2009, SIGIR.

[74] Geoffrey E. Hinton,et al. Semantic hashing , 2009, Int. J. Approx. Reason.

[75] DeLiang Wang,et al. Time-Frequency Masking for Speech Separation and Its Potential for Hearing Aid Design , 2008 .

[76] Benjamin Recht,et al. Random Features for Large-Scale Kernel Machines , 2007, NIPS.

[77] Rémi Gribonval,et al. Adaptation of Bayesian Models for Single-Channel Source Separation and its Application to Voice/Music Separation in Popular Songs , 2007, IEEE Transactions on Audio, Speech, and Language Processing.

[78] Giuseppe Riccardi,et al. Generative and discriminative algorithms for spoken language understanding , 2007, INTERSPEECH.

[79] Yoshua Bengio,et al. Learning Deep Architectures for AI , 2007, Found. Trends Mach. Learn.

[80] Christopher M. Bishop,et al. Pattern Recognition and Machine Learning (Information Science and Statistics) , 2006 .

[81] Geoffrey E. Hinton,et al. Reducing the Dimensionality of Data with Neural Networks , 2006, Science.

[82] Rémi Gribonval,et al. Performance measurement in blind audio source separation , 2006, IEEE Transactions on Audio, Speech, and Language Processing.

[83] Bhiksha Raj,et al. A Probabilistic Latent Variable Model for Acoustic Modeling , 2006 .

[84] Tong Zhang,et al. Learning Bounds for Kernel Regression Using Effective Data Dimensionality , 2005, Neural Computation.

[85] Gregory N. Hullender,et al. Learning to rank using gradient descent , 2005, ICML.

[86] Petros Drineas,et al. On the Nyström Method for Approximating a Gram Matrix for Improved Kernel-Based Learning , 2005, J. Mach. Learn. Res.

[87] Scott Rickard,et al. Blind separation of speech mixtures via time-frequency masking , 2004, IEEE Transactions on Signal Processing.

[88] A. Berlinet,et al. Reproducing kernel Hilbert spaces in probability and statistics , 2004 .

[89] Gábor Lugosi,et al. Introduction to Statistical Learning Theory , 2004, Advanced Lectures on Machine Learning.

[90] G. Baudat,et al. Feature vector selection and projection using kernels , 2003, Neurocomputing.

[91] Michael I. Jordan,et al. Latent Dirichlet Allocation , 2001, J. Mach. Learn. Res.

[92] Nello Cristianini,et al. Kernel Methods for Pattern Analysis , 2004 .

[93] Alexander J. Smola,et al. Learning with Kernels: support vector machines, regularization, optimization, and beyond , 2001, Adaptive computation and machine learning series.

[94] Gunnar Rätsch,et al. An introduction to kernel-based learning algorithms , 2001, IEEE Trans. Neural Networks.

[95] Bernhard Schölkopf,et al. Sampling Techniques for Kernel Methods , 2001, NIPS.

[96] Larry P. Heck,et al. Robustness to telephone handset distortion in speaker recognition by discriminative feature design , 2000, Speech Commun.

[97] Nello Cristianini,et al. An Introduction to Support Vector Machines and Other Kernel-based Learning Methods , 2000 .

[98] Martin D. Buhmann,et al. Radial Basis Functions , 2021, Encyclopedia of Mathematical Geosciences.

[99] Peter L. Bartlett,et al. Neural Network Learning - Theoretical Foundations , 1999 .

[100] H. Sebastian Seung,et al. Learning the parts of objects by non-negative matrix factorization , 1999, Nature.

[101] Mark J. F. Gales,et al. Maximum likelihood linear transformations for HMM-based speech recognition , 1998, Comput. Speech Lang.

[102] Vladimir Vapnik,et al. Statistical learning theory , 1998 .

[103] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.

[104] Bernhard Schölkopf,et al. Improving the accuracy and speed of support vector learning machines , 1997, NIPS 1997.

[105] Michael L. Littman,et al. Automatic Cross-Language Retrieval Using Latent Semantic Indexing , 1997 .

[106] Mitch Weintraub,et al. Nonlinear Discriminant Feature Extraction for Robust Text-Independent Speaker Recognition , 1997 .

[107] Bernhard Schölkopf,et al. Improving the Accuracy and Speed of Support Vector Machines , 1996, NIPS.

[108] Jorge Nocedal,et al. A Limited Memory Algorithm for Bound Constrained Optimization , 1995, SIAM J. Sci. Comput.

[109] Christopher M. Bishop,et al. Neural networks for pattern recognition , 1995 .

[110] Ronald J. Williams,et al. Gradient-based learning algorithms for recurrent networks and their computational complexity , 1995 .

[111] Jonathan G. Fiscus,et al. DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus CD-ROM (TIMIT), NIST , 1993 .

[112] Paul J. Werbos,et al. Backpropagation Through Time: What It Does and How to Do It , 1990, Proc. IEEE.

[113] Richard A. Harshman,et al. Indexing by Latent Semantic Analysis , 1990, J. Am. Soc. Inf. Sci.

[114] Kurt Hornik,et al. Multilayer feedforward networks are universal approximators , 1989, Neural Networks.

[115] Michael C. Mozer,et al. A Focused Backpropagation Algorithm for Temporal Pattern Recognition , 1989, Complex Syst.

[116] David Haussler,et al. What Size Net Gives Valid Generalization? , 1989, Neural Computation.

[117] Johan Håstad,et al. Almost optimal lower bounds for small depth circuits , 1986, STOC '86.

[118] David Malah,et al. Speech enhancement using a minimum mean-square error log-spectral amplitude estimator , 1984, IEEE Trans. Acoust. Speech Signal Process.

[119] S. Boll,et al. Suppression of acoustic noise in speech using spectral subtraction , 1979 .

[120] W. Rudin,et al. Fourier Analysis on Groups , 1965 .