Large Scale Data Enabled Evolution of Spoken Language Research and Applications
[1] Tom White,et al. Hadoop: The Definitive Guide , 2009 .
[2] S. Chiba,et al. Dynamic programming algorithm optimization for spoken word recognition , 1978 .
[3] Satoshi Nakamura,et al. The ATR Multilingual Speech-to-Speech Translation System , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[4] Kishore Prahallad,et al. Unit size in unit selection speech synthesis , 2003, INTERSPEECH.
[5] James H. Martin,et al. Speech and language processing: an introduction to natural language processing, computational linguistics, and speech recognition, 2nd Edition , 2000, Prentice Hall series in artificial intelligence.
[6] Y.K. Muthusamy,et al. Reviewing automatic language identification , 1994, IEEE Signal Processing Magazine.
[7] Joaquín González-Rodríguez,et al. Frame-by-frame language identification in short utterances using deep neural networks , 2015, Neural Networks.
[8] Douglas D. O'Shaughnessy. Speech Communications: Human and Machine , 2012 .
[9] Peter Norvig,et al. The Unreasonable Effectiveness of Data , 2009, IEEE Intelligent Systems.
[10] Hervé Bourlard,et al. Unknown-multiple speaker clustering using HMM , 2002, INTERSPEECH.
[11] Douglas A. Reynolds,et al. A study of new approaches to speaker diarization , 2009, INTERSPEECH.
[12] David A. van Leeuwen,et al. Improved speaker recognition when using i-vectors from multiple speech sources , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[13] Joaquín González-Rodríguez,et al. Automatic language identification using deep neural networks , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[14] Marc A. Zissman,et al. Automatic language identification , 2001, Speech Commun..
[15] Michael Picheny,et al. Statistical natural language generation for speech-to-speech machine translation systems , 2002, INTERSPEECH.
[16] Marc A. Zissman,et al. Automatic language identification of telephone speech messages using phoneme recognition and N-gram modeling , 1994, Proceedings of ICASSP '94. IEEE International Conference on Acoustics, Speech and Signal Processing.
[17] Nicholas W. D. Evans,et al. Speaker Diarization: A Review of Recent Research , 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[18] Simon Haykin,et al. Neural Networks: A Comprehensive Foundation , 1998 .
[19] Marc Ferras,et al. Speaker diarization and linking of large corpora , 2012, 2012 IEEE Spoken Language Technology Workshop (SLT).
[20] Fernando Pereira,et al. Distributed acoustic modeling with back-off n-grams , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[21] Keiichi Tokuda,et al. An analysis of machine translation and speech synthesis in speech-to-speech translation system , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[22] Heiga Zen,et al. Statistical parametric speech synthesis using deep neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[23] Haizhou Li,et al. The Asian network-based speech-to-speech translation system , 2009, 2009 IEEE Workshop on Automatic Speech Recognition & Understanding.
[24] Stan Davis,et al. Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Sentences , 1980 .
[25] Olivier Siohan,et al. A big data approach to acoustic model training corpus selection , 2014, INTERSPEECH.
[26] Douglas A. Reynolds,et al. Approaches to language identification using Gaussian mixture models and shifted delta cepstral features , 2002, INTERSPEECH.
[27] Richard P. Lippmann,et al. An introduction to computing with neural nets , 1987 .
[28] Geoffrey E. Hinton,et al. Acoustic Modeling Using Deep Belief Networks , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[29] Kishore Prahallad,et al. A multilingual screen reader in Indian languages , 2010, 2010 National Conference On Communications (NCC).
[30] Yoshua Bengio,et al. Greedy Layer-Wise Training of Deep Networks , 2006, NIPS.
[31] Nikki Mirghafori,et al. Nuts and Flakes: a Study of Data Characteristics in Speaker Diarization , 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.
[32] Jean-François Bonastre,et al. Step-by-step and integrated approaches in broadcast news speaker diarization , 2006, Comput. Speech Lang..
[33] Douglas A. Reynolds,et al. Language Recognition via i-vectors and Dimensionality Reduction , 2011, INTERSPEECH.
[34] Biing-Hwang Juang,et al. Fundamentals of speech recognition , 1993, Prentice Hall signal processing series.
[35] Lukás Burget,et al. Recurrent neural network based language model , 2010, INTERSPEECH.
[36] Bowen Zhou,et al. IBM MASTOR SYSTEM: Multilingual Automatic Speech-to-Speech Translator , 2006 .
[37] Hema A. Murthy,et al. Natural sounding TTS based on syllable-like units , 2006, 2006 14th European Signal Processing Conference.
[38] Robert L. Mercer,et al. Class-Based n-gram Models of Natural Language , 1992, CL.
[39] Lukás Burget,et al. Language Recognition in iVectors Space , 2011, INTERSPEECH.
[40] Douglas A. Reynolds,et al. An overview of automatic speaker recognition technology , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[41] Thierry Dutoit,et al. A comparative study of pitch extraction algorithms on a large variety of singing sounds , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[42] Constantine Kotropoulos,et al. Speaker segmentation and clustering , 2008, Signal Process..
[43] Douglas A. Reynolds,et al. Robust text-independent speaker identification using Gaussian mixture speaker models , 1995, IEEE Trans. Speech Audio Process..
[44] Marc A. Zissman,et al. Comparison of Four Approaches to Automatic Language Identification of Telephone Speech , 2004 .
[45] Fernando Pereira,et al. Distributed acoustic modeling with back-off n-grams , 2012, ICASSP.
[46] David Gerhard,et al. Pitch Extraction and Fundamental Frequency: History and Current Techniques , 2003 .
[47] R. Redner,et al. Mixture densities, maximum likelihood, and the EM algorithm , 1984 .
[48] David A. van Leeuwen,et al. Large-Scale Speaker Diarization for Long Recordings and Small Collections , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[49] B. Yegnanarayana,et al. Artificial Neural Networks , 2004 .
[50] Krzysztof Marasek,et al. SPEECON – Speech Databases for Consumer Devices: Database Specification and Validation , 2002, LREC.
[51] Yoshua Bengio,et al. Hierarchical Probabilistic Neural Network Language Model , 2005, AISTATS.
[52] Mark Johnson,et al. How the Statistical Revolution Changes (Computational) Linguistics , 2009 .
[53] Katrin Kirchhoff. Chapter 2 – Language Characteristics , 2006 .
[54] Bernard Mérialdo,et al. A Dynamic Language Model for Speech Recognition , 1991, HLT.
[55] Haizhou Li,et al. An overview of text-independent speaker recognition: From features to supervectors , 2010, Speech Commun..
[56] Heiga Zen,et al. Statistical Parametric Speech Synthesis , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.
[57] Simon King,et al. Multisyn: Open-domain unit selection for the Festival speech synthesis system , 2007, Speech Commun..
[58] Oliver Schreer,et al. Diarizing large corpora using multi-modal speaker linking , 2014, INTERSPEECH.
[59] Lukás Burget,et al. Strategies for training large scale neural network language models , 2011, 2011 IEEE Workshop on Automatic Speech Recognition & Understanding.
[60] Heiga Zen,et al. Directly modeling speech waveforms by neural networks for statistical parametric speech synthesis , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[61] Alan W. Black,et al. Limited domain synthesis , 2000, INTERSPEECH.
[62] Björn Schuller,et al. openSMILE: the Munich versatile and fast open-source audio feature extractor , 2010, ACM Multimedia.
[63] Preethi Jyothi,et al. Large-scale discriminative language model reranking for voice-search , 2012, WLM@NAACL-HLT.
[64] Leena Mary. Automatic Extraction of Prosody for Speaker, Language and Speech Recognition , 2012 .
[65] Brian Roark,et al. Discriminative Language Modeling with Conditional Random Fields and the Perceptron Algorithm , 2004, ACL.
[66] Sanjay Ghemawat,et al. MapReduce: Simplified Data Processing on Large Clusters , 2004, OSDI.
[67] Marc'Aurelio Ranzato,et al. Large Scale Distributed Deep Networks , 2012, NIPS.
[68] Haizhou Li,et al. Language Identification: A Tutorial , 2011, IEEE Circuits and Systems Magazine.
[69] Mohammad Hossein Moattar,et al. A review on speaker diarization systems and approaches , 2012, Speech Commun..
[70] Bayya Yegnanarayana,et al. Extraction and representation of prosodic features for language and speaker recognition , 2008, Speech Commun..
[71] Sanjeev Khudanpur,et al. Efficient Subsampling for Training Complex Language Models , 2011, EMNLP.
[72] Wei Zhang,et al. The IBM speech-to-speech translation system for smartphone: Improvements for resource-constrained tasks , 2013, Comput. Speech Lang..
[73] Woojay Jeon,et al. Efficient speaker search over large populations using kernelized locality-sensitive hashing , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[74] Frederick Jelinek,et al. Structured language modeling , 2000, Comput. Speech Lang..
[75] Wang Lirong,et al. Articulatory Speech Synthesis: A Survey , 2011, 2011 14th IEEE International Conference on Computational Science and Engineering.
[76] Lawrence R. Rabiner,et al. A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.
[77] Yee Whye Teh,et al. A Fast Learning Algorithm for Deep Belief Nets , 2006, Neural Computation.
[78] Hema A. Murthy,et al. Methods for improving the quality of syllable based speech synthesis , 2008, 2008 IEEE Spoken Language Technology Workshop.
[79] Joshua Goodman,et al. Classes for fast maximum entropy training , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).
[80] Stanley F. Chen,et al. An Empirical Study of Smoothing Techniques for Language Modeling , 1996, ACL.
[81] H Bung. Automatic speech recognition and understanding : A first step toward natural human-machine communication , 2000 .
[82] Vijay V. Raghavan,et al. Big Data: Promises and Problems , 2015, Computer.
[83] Rohit Prasad,et al. Batch-mode semi-supervised active learning for statistical machine translation , 2013, Comput. Speech Lang..
[84] Vijay V. Raghavan,et al. Big Data Driven Natural Language Processing Research and Applications , 2015 .
[85] William M. Campbell,et al. Support vector machines for speaker and language recognition , 2006, Comput. Speech Lang..
[86] Douglas A. Reynolds,et al. An overview of automatic speaker diarization systems , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[87] Joshua Goodman,et al. A bit of progress in language modeling , 2001, Comput. Speech Lang..
[88] H Hermansky,et al. Perceptual linear predictive (PLP) analysis of speech. , 1990, The Journal of the Acoustical Society of America.
[89] Thorsten Brants,et al. One billion word benchmark for measuring progress in statistical language modeling , 2013, INTERSPEECH.
[90] S. Furui,et al. Automatic recognition and understanding of spoken language - a first step toward natural human-machine communication , 2000, Proceedings of the IEEE.
[91] Alan W. Black,et al. Unit selection in a concatenative speech synthesis system using a large speech database , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.
[92] Christopher J. C. Burges,et al. A Tutorial on Support Vector Machines for Pattern Recognition , 1998, Data Mining and Knowledge Discovery.
[93] Vijendra Raj Apsingekar,et al. Speaker Model Clustering for Efficient Speaker Identification in Large Population Applications , 2009, IEEE Transactions on Audio, Speech, and Language Processing.
[94] Bin Ma,et al. Spoken Language Recognition: From Fundamentals to Practice , 2013, Proceedings of the IEEE.
[95] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[96] Piotr Indyk,et al. Approximate Nearest Neighbor: Towards Removing the Curse of Dimensionality , 2012, Theory Comput..
[97] Johan Schalkwyk,et al. Query language modeling for voice search , 2010, 2010 IEEE Spoken Language Technology Workshop.
[98] Yoav Goldberg,et al. A Dataset of Syntactic-Ngrams over Time from a Very Large Corpus of English Books , 2013, *SEMEVAL.
[99] Yoshua Bengio,et al. A Neural Probabilistic Language Model , 2003, J. Mach. Learn. Res..
[100] Sameeraj Meduri,et al. A survey and evaluation of voice activity detection algorithms: speech processing module , 2012 .
[101] Marijn Huijbregts,et al. The ICSI RT07s Speaker Diarization System , 2007, CLEAR.
[102] Yee Whye Teh,et al. A Hierarchical Bayesian Language Model Based On Pitman-Yor Processes , 2006, ACL.
[103] Panayiotis G. Georgiou,et al. Unsupervised data processing for classifier-based speech translator , 2013, Comput. Speech Lang..
[104] Douglas D. O'Shaughnessy,et al. Invited paper: Automatic speech recognition: History, methods and challenges , 2008, Pattern Recognit..
[105] Douglas A. Reynolds,et al. A Tutorial on Text-Independent Speaker Verification , 2004, EURASIP J. Adv. Signal Process..
[106] Patrick Kenny,et al. Front-End Factor Analysis for Speaker Verification , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[107] Ronald W. Schafer,et al. Theory and Applications of Digital Speech Processing , 2010 .
[108] Ronald W. Schafer,et al. Digital Processing of Speech Signals , 1978 .
[109] Tanja Schultz,et al. LVCSR-based language identification , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.