Evaluating Low-Level Speech Features Against Human Perceptual Data
Aren Jansen | Naomi H. Feldman | Caitlin Richter | Harini Salgado
[1] Keith Johnson,et al. Resonance in an exemplar-based lexicon: The emergence of social identity and phonology , 2006, J. Phonetics.
[2] William J. Byrne,et al. Acoustic training from heterogeneous data sources: experiments in Mandarin conversational telephone speech transcription , 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005.
[3] Olli Viikki,et al. Cepstral domain segmental feature vector normalization for noise robust speech recognition , 1998, Speech Commun.
[4] T. M. Nearey. Static, dynamic, and relational properties in vowel perception. , 1989, The Journal of the Acoustical Society of America.
[5] Martin Karafiát,et al. Further investigation into multilingual training and adaptation of stacked bottle-neck neural network structure , 2014, 2014 IEEE Spoken Language Technology Workshop (SLT).
[6] J. D. Miller,et al. Auditory-perceptual interpretation of the vowel. , 1989, The Journal of the Acoustical Society of America.
[7] Dave F. Kleinschmidt,et al. Robust speech perception: recognize the familiar, generalize to the similar, and adapt to the novel. , 2015, Psychological review.
[8] Kaori Idemaru,et al. Specificity of dimension-based statistical learning in word recognition. , 2014, Journal of experimental psychology. Human perception and performance.
[9] Benjamin Halberstam,et al. Vowel normalization: the role of fundamental frequency and upper formants , 2004, J. Phonetics.
[10] Keith S. Apfelbaum,et al. Relative cue encoding in the context of sophisticated models of categorization: Separating information from categorization , 2015, Psychonomic bulletin & review.
[11] Hynek Hermansky,et al. Multilingual MLP features for low-resource LVCSR systems , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[12] Fabio Brugnara,et al. Improved automatic speech recognition through speaker normalization , 2006, Comput. Speech Lang.
[13] S. Nittrouer. Age-related differences in perceptual effects of formant transitions within syllables and across syllable boundaries , 1992.
[14] Jeff A. Bilmes,et al. Unsupervised learning of acoustic features via deep canonical correlation analysis , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[15] Jordan Cohen,et al. Vocal tract normalization in speech recognition: Compensating for systematic speaker variability , 1995.
[16] B. Repp. Phonetic trading relations and context effects: new experimental evidence for a speech mode of perception. , 1982, Psychological bulletin.
[17] D. Dahan,et al. Talker adaptation in speech perception: Adjusting the signal or the representations? , 2008, Cognition.
[18] Stan Davis,et al. Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Se , 1980.
[19] J. Hillenbrand,et al. Acoustic characteristics of American English vowels. , 1994, The Journal of the Acoustical Society of America.
[20] Philip J. Monahan,et al. Auditory sensitivity to formant ratios: Toward an account of vowel normalisation , 2010, Language and cognitive processes.
[21] Shrikanth S. Narayanan,et al. Effect of spectral normalization on different talker speech recognition by cochlear implant users. , 2008, The Journal of the Acoustical Society of America.
[22] S. Wegmann,et al. Speaker normalization on conversational telephone speech , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.
[23] Jennifer Cole,et al. Unmasking the acoustic effects of vowel-to-vowel coarticulation: A statistical modeling approach , 2010, J. Phonetics.
[24] W. K. Hastings,et al. Monte Carlo Sampling Methods Using Markov Chains and Their Applications , 1970.
[25] Aren Jansen,et al. Rapid Evaluation of Speech Representations for Spoken Term Discovery , 2011, INTERSPEECH.
[26] Hermann Ney,et al. Acoustic front-end optimization for large vocabulary speech recognition , 1997, EUROSPEECH.
[27] Jessica Maye,et al. Infant sensitivity to distributional information can affect phonetic discrimination , 2002, Cognition.
[28] George Saon,et al. Feature and model space speaker adaptation with full covariance Gaussians , 2006, INTERSPEECH.
[29] B. Lobanov. Classification of Russian Vowels Spoken by Different Speakers , 1971.
[30] Kaori Idemaru,et al. Word recognition reflects dimension-based statistical learning. , 2011, Journal of experimental psychology. Human perception and performance.
[31] Emily B. Myers,et al. The Perception of Voice Onset Time: An fMRI Investigation of Phonetic Category Structure , 2005, Journal of Cognitive Neuroscience.
[32] S. Molau,et al. Feature space normalization in adverse acoustic conditions , 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).
[33] T. M. Nearey. Phonetic feature systems for vowels , 1978.
[34] Brian Kingsbury,et al. Very deep multilingual convolutional neural networks for LVCSR , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[35] Joshua B. Tenenbaum,et al. Phrase similarity in humans and machines , 2015, CogSci.
[36] S. Blumstein,et al. The effect of subphonetic differences on lexical access , 1994, Cognition.
[37] Sanjeev R. Kulkarni,et al. A Nearest-Neighbor Approach to Estimating Divergence between Continuous Random Vectors , 2006, 2006 IEEE International Symposium on Information Theory.
[38] Adam N Sanborn,et al. Exemplar models as a mechanism for performing Bayesian inference , 2010, Psychonomic bulletin & review.
[39] D. Rubin,et al. Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper , 1977.
[40] Roel Smits,et al. A comparison of vowel normalization procedures for language variation research. , 2004, The Journal of the Acoustical Society of America.
[41] Hynek Hermansky,et al. Evaluation and optimization of perceptually-based ASR front-end , 1993, IEEE Trans. Speech Audio Process.
[42] Hynek Hermansky,et al. Evaluating speech features with the minimal-pair ABX task (II): resistance to noise , 2014, INTERSPEECH.
[43] Reinhold Häb-Umbach. Investigations on inter-speaker variability in the feature space , 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99.
[44] Naomi H. Feldman,et al. The influence of categories on perception: explaining the perceptual magnet effect as optimal statistical inference. , 2009, Psychological review.
[45] David B. Pisoni,et al. The Nationwide Speech Project: A new corpus of American English dialects , 2006, Speech Commun.
[46] Aren Jansen,et al. Evaluating speech features with the minimal-pair ABX task: analysis of the classical MFC/PLP pipeline , 2013, INTERSPEECH.
[47] B. McMurray,et al. What information is necessary for speech categorization? Harnessing variability in the speech signal by integrating cues computed relative to expectations. , 2011, Psychological review.
[48] G. E. Peterson. Parameters of vowel quality. , 1961, Journal of speech and hearing research.
[49] Mark J. F. Gales,et al. The Application of Hidden Markov Models in Speech Recognition , 2007, Found. Trends Signal Process.
[50] M E Miller,et al. Predicting developmental shifts in perceptual weighting schemes. , 1997, The Journal of the Acoustical Society of America.
[51] R. Jacobs,et al. Perception of speech reflects optimal use of probabilistic speech cues , 2008, Cognition.
[52] Santiago Barreda,et al. Vowel normalization and the perception of speaker changes: an exploration of the contextual tuning hypothesis. , 2012, The Journal of the Acoustical Society of America.
[53] D. Pisoni,et al. Reaction times to comparisons within and across phonetic categories , 1974, Perception & psychophysics.
[54] Yakov Kronrod,et al. A unified account of categorical effects in phonetic perception , 2016, Psychonomic bulletin & review.
[55] Chong Wang,et al. Reading Tea Leaves: How Humans Interpret Topic Models , 2009, NIPS.
[56] Ran Liu,et al. Dimension-based statistical learning of vowels. , 2015, Journal of experimental psychology. Human perception and performance.
[57] H Hermansky,et al. Perceptual linear predictive (PLP) analysis of speech. , 1990, The Journal of the Acoustical Society of America.
[58] Aren Jansen,et al. Unsupervised neural network based feature extraction using weak top-down constraints , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[59] Georg Heigold,et al. Multilingual acoustic models using distributed deep neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[60] G. E. Peterson,et al. Control Methods Used in a Study of the Vowels , 1951.
[61] Marc F Joanisse,et al. Mismatch negativity reflects sensory and phonetic speech processing , 2007, Neuroreport.
[62] A. Lotto,et al. Cue weighting in auditory categorization: implications for first and second language acquisition. , 2006, The Journal of the Acoustical Society of America.
[63] Kaori Idemaru,et al. Individual differences in cue weights are stable across time: the case of Japanese stop lengths. , 2012, The Journal of the Acoustical Society of America.
[64] Joseph C. Toscano,et al. Continuous Perception and Graded Categorization , 2010, Psychological science.
[65] Hynek Hermansky,et al. RASTA processing of speech , 1994, IEEE Trans. Speech Audio Process.