A novel Gaussianized vector representation for natural scene categorization
Thomas S. Huang | Mark Hasegawa-Johnson | Hao Tang | Xiaodan Zhuang | Xi Zhou
[1] Mark Johnson,et al. Multi-vector pitch-orthogonal LPC: quality speech with low complexity at rates between 4 and 8 kbps , 1990, ICSLP.
[2] James W. Beauchamp,et al. Acoustics, Audio, and Music Technology Education at the University of Illinois at Urbana‐Champaign , 2001 .
[3] Mark Hasegawa-Johnson,et al. Analysis of the three-dimensional tongue shape using a three-index factor analysis model. , 2003, The Journal of the Acoustical Society of America.
[4] Jennifer Cole,et al. Sets for the Automatic Detection of Prosodic Prominence , 2010 .
[5] Thomas S. Huang,et al. Emotion recognition from speech via boosted Gaussian mixture models , 2009, 2009 IEEE International Conference on Multimedia and Expo.
[6] Mark Hasegawa-Johnson,et al. A Maximum Likelihood Prosody Recognizer , 2004 .
[7] Mark Hasegawa-Johnson,et al. Optimal speech estimator considering room response as well as additive noise: Different approaches in low and high frequency range , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.
[8] M. Hasegawa-Johnson,et al. Bayesian learning for models of human speech perception , 2004, IEEE Workshop on Statistical Signal Processing, 2003.
[9] CTMRedit : A Case Study in Human-Computer Interface Design , 1999 .
[10] Using Web Mining Techniques to Build a Multi-dialect Lexicon of Arabic , .
[11] Mark Hasegawa-Johnson,et al. Landmark-based automated pronunciation error detection , 2010, INTERSPEECH.
[12] Mark Hasegawa-Johnson,et al. PLP coefficients can be quantized at 400 bps , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).
[13] Mark Hasegawa-Johnson,et al. Frequency of consonant articulation errors in dysarthric speech , 2010, Clinical linguistics & phonetics.
[14] Mark Hasegawa-Johnson,et al. How do ordinary listeners perceive prosodic prominence? Syntagmatic versus paradigmatic comparison. , 2009 .
[15] M. Johnson,et al. Pitch sharpening for perceptually improved CELP, and the sparse-delta codebook for reduced computation , 1991, [Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing.
[16] Stephen E. Levinson,et al. An Empathic-tutoring System Using Spoken Language , 2007 .
[17] Jui Ting Huang,et al. Unsupervised Prosodic Break Detection in Mandarin Speech , 2008 .
[18] Abeer Alwan,et al. Speech Coding: Fundamentals and Applications , 2003 .
[19] Mark Hasegawa-Johnson,et al. Approximately independent factors of speech using nonlinear symplectic transformation , 2003, IEEE Trans. Speech Audio Process..
[20] Yanli Zheng. PARAFAC analysis of the three dimensional tongue shape , 2007 .
[21] Po-Sen Huang,et al. Prosody-dependent acoustic modeling using variable-parameter hidden markov models , 2010 .
[22] Mark Hasegawa-Johnson,et al. Novel entropy based moving average refiners for HMM landmarks , 2006, INTERSPEECH.
[23] Jeung-Yoon Choi,et al. Finding intonational boundaries using acoustic cues related to the voice source. , 2005, The Journal of the Acoustical Society of America.
[24] Mark Hasegawa-Johnson,et al. How Unlabeled Data Change the Acoustic Models For Phonetic Classification , 2010 .
[25] Mark Hasegawa-Johnson,et al. FSM-based pronunciation modeling using articulatory phonological code , 2010, INTERSPEECH.
[26] Thomas S. Huang,et al. Kernel metric learning for phonetic classification , 2009, 2009 IEEE Workshop on Automatic Speech Recognition & Understanding.
[27] Jeung-Yoon Choi,et al. Simultaneous recognition of words and prosody in the Boston University Radio Speech Corpus , 2005, Speech Commun..
[28] Mark Hasegawa-Johnson,et al. Acoustic Differentiation of ip and IP Boundary Levels: Comparison of L- and L-L% in the Switchboard Corpus , 2004 .
[29] Ken Chen,et al. Speech Recognition Models of the Interdependence Among Syntax, Prosody, and Segmental Acoustics , 2004, HLT-NAACL 2004.
[30] Automated Pronunciation Scoring for L2 English Learners , 2008 .
[31] Mark Hasegawa-Johnson,et al. ON THE EDGE: ACOUSTIC CUES TO LAYERED PROSODIC DOMAINS , 2007 .
[32] Yun Fu,et al. Lipreading by Locality Discriminant Graph , 2007, 2007 IEEE International Conference on Image Processing.
[33] Mark Hasegawa-Johnson,et al. Distinctive feature based SVM discriminant features for improvements to phone recognition on telephone band speech , 2005, INTERSPEECH.
[34] Stephen E. Levinson,et al. A Hybrid Model for Spontaneous Speech Understanding , 2005 .
[35] M. Hasegawa-Johnson,et al. CTMRedit: a Matlab-based tool for segmenting and interpolating MRI and CT images in three orthogonal planes , 1999, Proceedings of the First Joint BMES/EMBS Conference. 1999 IEEE Engineering in Medicine and Biology 21st Annual Conference and the 1999 Annual Fall Meeting of the Biomedical Engineering Society (Cat. N.
[36] Mark Hasegawa-Johnson,et al. Multivariate-state hidden Markov models for simultaneous transcription of phones and formants , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).
[37] Zhihong Zeng,et al. Sensitive Talking Heads , 2009 .
[38] Abeer Alwan,et al. Vowel category dependence of the relationship between palate height, tongue height, and oral area. , 2003, Journal of speech, language, and hearing research : JSLHR.
[39] Ming Liu,et al. Robust Analysis and Weighting on MFCC Components for Speech Recognition and Speaker Identification , 2007, 2007 IEEE International Conference on Multimedia and Expo.
[40] M. Hasegawa-Johnson,et al. Particle filtering approach to Bayesian formant tracking , 2003, IEEE Workshop on Statistical Signal Processing, 2003.
[41] Thomas S. Huang,et al. Real-world acoustic event detection , 2010, Pattern Recognit. Lett..
[42] Mark Hasegawa-Johnson,et al. Prosodic effects on acoustic cues to stop voicing and place of articulation: Evidence from Radio News speech , 2007, J. Phonetics.
[43] Pietro Perona,et al. A Bayesian hierarchical model for learning natural scene categories , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).
[44] Mark Hasegawa-Johnson,et al. Prosodic effects on temporal structure of monosyllabic CVC words in American English , 2010 .
[45] Tomohiko Taniguchi,et al. Speech coding system having codebook storing differential vectors between each two adjoining code vectors , 1995 .
[46] Wen Gao,et al. An improved active shape model for face alignment , 2002, Proceedings. Fourth IEEE International Conference on Multimodal Interfaces.
[47] Barbara Caputo,et al. Recognition with local features: the kernel recipe , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.
[48] Mark Hasegawa-Johnson,et al. NON-LINEAR INDEPENDENT COMPONENT ANALYSIS FOR SPEECH RECOGNITION , 2003 .
[49] Mark Hasegawa-Johnson,et al. Automated pronunciation scoring using confidence scoring and landmark-based SVM , 2009, INTERSPEECH.
[50] Thomas S. Huang,et al. Novel Gaussianized vector representation for improved natural scene categorization , 2010, Pattern Recognit. Lett..
[51] Mark Hasegawa-Johnson,et al. A Factored Language Model for Prosody Dependent Speech Recognition , 2007 .
[52] Stephen E. Levinson,et al. Children's emotion recognition in an intelligent tutoring scenario , 2004, INTERSPEECH.
[53] M. Johnson,et al. Low-complexity multi-mode VXC using multi-stage optimization and mode selection (speech coding) , 1991, [Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing.
[54] Mark Hasegawa-Johnson,et al. Acoustic fall detection using Gaussian mixture models and GMM supervectors , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.
[55] Thomas S. Huang,et al. A Novel Vector Representation of Stochastic Signals Based on Adapted Ergodic HMMs , 2010, IEEE Signal Processing Letters.
[56] Mark Hasegawa-Johnson,et al. Signal-based and expectation-based factors in the perception of prosodic prominence , 2010 .
[57] Mark Hasegawa-Johnson,et al. The effect of accent on acoustic cues to stop voicing and place of articulation in radio news speech , 2004, Speech Prosody 2004.
[58] Ken Chen,et al. An evaluation of using mutual information for selection of acoustic-features representation of phonemes for speech recognition , 2002, INTERSPEECH.
[59] Mark Hasegawa-Johnson,et al. Towards Interpretation of Creakiness in Switchboard , 2008 .
[60] Antonio Torralba,et al. Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope , 2001, International Journal of Computer Vision.
[61] Jennifer Cole,et al. Speaker-Independent Automatic Detection of Pitch Accent , 2004 .
[62] A. Treisman,et al. A feature-integration theory of attention , 1980, Cognitive Psychology.
[63] Bowon Lee,et al. A PHONEMIC RESTORATION APPROACH FOR AUTOMATIC SPEECH RECOGNITION WITH HIGHLY NONSTATIONARY BACKGROUND NOISE , 2022 .
[64] Mark Johnson,et al. Automatic context-sensitive measurement of the acoustic correlates of distinctive features at landmarks , 1994, ICSLP.
[65] Mark Hasegawa-Johnson,et al. Detecting Non-modal Phonation in Telephone Speech , 2008 .
[66] Mark Hasegawa-Johnson,et al. Intertranscriber reliability of prosodic labeling on telephone conversation using ToBI , 2004, INTERSPEECH.
[67] Bernt Schiele,et al. A Semantic Typicality Measure for Natural Scene Categorization , 2004, DAGM-Symposium.
[68] Mark Hasegawa-Johnson,et al. Information theory and variance estimation techniques in the analysis of category rating and paired comparisons , 1997 .
[69] Mark Hasegawa-Johnson,et al. PROSODY AS A CONDITIONING VARIABLE IN SPEECH RECOGNITION , 2003 .
[70] Mark Hasegawa-Johnson,et al. Stop consonant classification by dynamic formant trajectory , 2004, INTERSPEECH.
[71] Mark Hasegawa-Johnson,et al. Frequency and repetition effects outweigh phonetic detail in prominence perception , 2008 .
[72] Mark Hasegawa-Johnson,et al. Formant tracking by mixture state particle filter , 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[73] Mark Hasegawa-Johnson,et al. Generalized Optimal Multi-Microphone Speech Enhancement Using Sequential Minimum Variance Distortionless Response (MVDR) Beamforming and Postfiltering , 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.
[74] Trevor Darrell,et al. Pyramid Match Kernels: Discriminative Classification with Sets of Image Features (version 2) , 2006 .
[75] Mark Hasegawa-Johnson,et al. Source separation using particle filters , 2004, INTERSPEECH.
[76] M. Johnson,et al. Pitch-orthogonal code-excited LPC , 1990, [Proceedings] GLOBECOM '90: IEEE Global Telecommunications Conference and Exhibition.
[77] Yun Fu,et al. Humanoid Audio–Visual Avatar With Emotive Text-to-Speech Synthesis , 2008, IEEE Transactions on Multimedia.
[78] M. Hasegawa-Johnson,et al. Acoustic Cues to Lexical Stress in Spastic Dysarthria , 2009 .
[79] Mark Hasegawa-Johnson,et al. Joint estimation of DOA and speech based on EM beamforming , 2010, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing.
[80] Mark Hasegawa-Johnson,et al. Semi-supervised training of Gaussian mixture models by conditional entropy minimization , 2010, INTERSPEECH.
[81] Thomas Hofmann,et al. Unsupervised Learning by Probabilistic Latent Semantic Analysis , 2004, Machine Learning.
[82] Mark Hasegawa-Johnson. Time-frequency distribution of partial phonetic information measured using mutual information , 2000, INTERSPEECH.
[83] Lae-Hoon Kim,et al. Toward overcoming fundamental limitation in frequency-domain blind source separation for reverberant speech mixtures , 2010, 2010 Conference Record of the Forty Fourth Asilomar Conference on Signals, Systems and Computers.
[84] Mark Hasegawa-Johnson,et al. Maximum mutual information estimation with unlabeled data for phonetic classification , 2008, INTERSPEECH.
[85] Karen Livescu. Articulatory Feature-based Methods for Acoustic and Audio-Visual Speech Recognition: 2006 JHU Summer Workshop Final Report , 2007 .
[86] Stephen E. Levinson,et al. Automatic detection of contrast for speech understanding , 2004, INTERSPEECH.
[87] Mark Hasegawa-Johnson,et al. Voice Quality Dependent Speech Recognition , 2009 .
[88] Mark A. Hasegawa-Johnson,et al. Brain anatomy differences in childhood stuttering , 2008, NeuroImage.
[89] Mark Hasegawa-Johnson,et al. Robust automatic speech recognition with decoder oriented ideal binary mask estimation , 2010, INTERSPEECH.
[90] Mark Hasegawa-Johnson,et al. The entropy of the articulatory phonological code: recognizing gestures from tract variables , 2008, INTERSPEECH.
[91] Mark Hasegawa-Johnson,et al. Acoustic differentiation of L- and L-L% in switchboard and radio news speech , 2006 .
[92] Mark Hasegawa-Johnson,et al. Novel time domain multi-class SVMs for landmark detection , 2006, INTERSPEECH.
[93] Mark Hasegawa-Johnson,et al. Modeling and recognition of phonetic and prosodic factors for improvements to acoustic speech recognition models , 2004, INTERSPEECH.
[94] Cordelia Schmid,et al. Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).
[95] Yuxiao Hu,et al. Real-time conversion from a single 2D face image to a 3D text-driven emotive audio-visual avatar , 2008, 2008 IEEE International Conference on Multimedia and Expo.
[96] Harry Shum,et al. Face alignment using statistical models and wavelet features , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..
[97] Simon King,et al. Articulatory Feature-Based Methods for Acoustic and Audio-Visual Speech Recognition: Summary from the 2006 JHU Summer workshop , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.
[98] Stephen E. Levinson,et al. Semantic analysis for a speech user interface in an intelligent tutoring system , 2004, IUI '04.
[99] Mark Hasegawa-Johnson,et al. Modeling pronunciation variation using artificial neural networks for English spontaneous speech , 2004, INTERSPEECH.
[100] Mark Hasegawa-Johnson,et al. Generalized multi-microphone spectral amplitude estimation based on correlated noise model , 2007 .
[101] Stephen E. Levinson,et al. Extraction of pragmatic and semantic salience from spontaneous spoken English , 2006, Speech Commun..
[102] Ming Liu,et al. AVICAR: audio-visual speech corpus in a car environment , 2004, INTERSPEECH.
[103] Mark Hasegawa-Johnson,et al. Speech enhancement beyond minimum mean squared error with perceptual noise shaping. , 2010 .
[104] Kate Saenko,et al. AUDIOVISUAL SPEECH RECOGNITION WITH ARTICULATOR POSITIONS AS HIDDEN VARIABLES , 2007 .
[105] Jeung-Yoon Choi,et al. Prosody dependent speech recognition on radio news corpus of American English , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[106] Mark Hasegawa-Johnson,et al. A novel algorithm for sparse classification. , 2010 .
[107] Mark Johnson,et al. Online and offline computational reduction techniques using backward filtering in CELP speech coders , 1992, IEEE Trans. Signal Process..
[108] Frank K. Soong,et al. A minimum converted trajectory error (MCTE) approach to high quality speech-to-lips conversion , 2010, INTERSPEECH.
[109] Xiaodan Zhuang,et al. Efficient object localization with gaussianized vector representation , 2009, IMCE '09.
[110] Mark Hasegawa-Johnson,et al. Acoustic model for robustness analysis of optimal multipoint room equalization. , 2008, The Journal of the Acoustical Society of America.
[111] Thomas S. Huang,et al. Dysarthric speech database for universal access research , 2008, INTERSPEECH.
[112] Mark Hasegawa-Johnson,et al. Human speech perception and feature extraction , 2008, INTERSPEECH.
[113] Shuicheng Yan,et al. SIFT-Bag kernel for video event analysis , 2008, ACM Multimedia.
[114] Arthur Kantor,et al. Stream weight tuning in dynamic Bayesian networks , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.
[115] Ming Liu,et al. HMM-Based Acoustic Event Detection with AdaBoost Feature Selection , 2007, CLEAR.
[116] Mark Hasegawa-Johnson,et al. Landmark-based speech recognition: report of the 2004 Johns Hopkins summer workshop , 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005..
[117] HMM-based Pronunciation Dictionary Generation , 2010 .
[118] Mark Hasegawa-Johnson,et al. On Semi-Supervised Learning of Gaussian Mixture Models for Phonetic Classification , 2009, HLT-NAACL 2009.
[119] Mark Hasegawa-Johnson,et al. Auditory-modeling inspired methods of feature extraction for robust automatic speech recognition , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[120] Trevor Darrell,et al. The pyramid match kernel: discriminative classification with sets of image features , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.
[121] Lixin Fan,et al. Categorizing Nine Visual Classes using Local Appearance Descriptors , 2004 .
[122] Thomas S. Huang,et al. Non-frontal view facial expression recognition based on ergodic hidden Markov model supervectors , 2010, 2010 IEEE International Conference on Multimedia and Expo.
[123] Mark Hasegawa-Johnson,et al. Automatic recognition of pitch movements using multilayer perceptron and time-Delay Recursive neural network , 2004, IEEE Signal Processing Letters.
[124] Bowon Lee. MINIMUM MEAN-SQUARED ERROR A POSTERIORI ESTIMATION OF HIGH VARIANCE VEHICULAR NOISE , .
[125] Mark Hasegawa-Johnson,et al. A factorial HMM approach to simultaneous recognition of isolated digits spoken by multiple talkers on one audio channel , 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[126] Marvin Johnson. A mapping between trainable generalized properties and the acoustic correlates of distinctive features , 1993 .
[127] Stephen E. Levinson,et al. Cognitive state classification in a spoken tutorial dialogue system , 2006, Speech Commun..
[128] Ming Liu,et al. Exploring Discriminative Learning for Text-Independent Speaker Recognition , 2007, 2007 IEEE International Conference on Multimedia and Expo.
[129] Marvin Johnson. Using beam elements to model the vocal fold length in breathy voicing. , 1992 .
[130] Mark Hasegawa-Johnson,et al. Model enforcement: a unified feature transformation framework for classification and recognition , 2004, IEEE Transactions on Signal Processing.
[131] Mark Hasegawa-Johnson,et al. A factorial HMM approach to robust isolated digit recognition in background music , 2004, INTERSPEECH.
[132] Ming Liu,et al. Regression from patch-kernel , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.
[133] M. Hasegawa-Johnson,et al. Automatic Fluency Assessment by Signal-Level Measurement of Spontaneous Speech , 2010 .
[134] Mark Hasegawa-Johnson. Burst spectral measures and formant frequencies can be used to accurately discriminate place of articulation , 1995 .
[135] M. Hasegawa-Johnson,et al. Gaussian mixture models of phonetic boundaries for speech recognition , 2001, IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01..
[136] Mark Hasegawa-Johnson,et al. A procedure for estimating gestural scores from natural speech , 2010, INTERSPEECH.
[137] Mark Hasegawa-Johnson,et al. Acoustic correlates of non‐modal phonation in telephone speech , 2005 .
[138] Mark Hasegawa-Johnson,et al. Maximum mutual information based acoustic-features representation of phonological features for speech recognition , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[139] Stephen E. Levinson,et al. MENTAL STATE DETECTION OF DIALOGUE SYSTEM USERS VIA SPOKEN LANGUAGE , 2003 .
[140] M. Hasegawa-Johnson,et al. Electromagnetic exposure safety of the Carstens articulograph AG100. , 1998, The Journal of the Acoustical Society of America.
[141] Mark Hasegawa-Johnson,et al. Acoustic segmentation using switching state Kalman filter , 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03)..
[142] Thomas S. Huang,et al. Two-stage prosody prediction for emotional text-to-speech synthesis , 2008, INTERSPEECH.
[143] Mark Hasegawa-Johnson,et al. A procedure for estimating gestural scores from articulatory data. , 2010 .
[144] Mark Hasegawa-Johnson,et al. How Prosody Improves Word Recognition , 2004 .
[145] Mark Hasegawa-Johnson,et al. Prosodic Hierarchy as an Organizing Framework for the Sources of Context in Phone-Based and Articulatory-Feature-Based Speech Recognition , 2009 .
[146] M. Hasegawa-Johnson,et al. The effect of accent on the acoustic cues to stop voicing in Radio News speech , 2003 .
[147] Thomas S. Huang,et al. Face age estimation using patch-based hidden Markov model supervectors , 2008, 2008 19th International Conference on Pattern Recognition.
[148] Mark Hasegawa-Johnson,et al. Prosodic effects on vowel production: evidence from formant structure , 2009, INTERSPEECH.
[149] Mark Hasegawa-Johnson,et al. Kinematic analysis of tongue movement control in spastic dysarthria , 2010, INTERSPEECH.
[150] Mark Hasegawa-Johnson,et al. Prosody dependent speech recognition with explicit duration modelling at intonational phrase boundaries , 2003, INTERSPEECH.
[151] Thomas S. Huang,et al. HMM-Based and SVM-Based Recognition of the Speech of Talkers With Spastic Dysarthria , 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.
[152] Yun Fu,et al. EAVA: A 3D Emotive Audio-Visual Avatar , 2008, 2008 IEEE Workshop on Applications of Computer Vision.
[153] Thomas S. Huang,et al. Feature analysis and selection for acoustic event detection , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.
[154] David Harwath. Phonetic Landmark Detection for Automatic Language Identification , 2010 .
[155] Lin Yang,et al. E-coder for Automatic Scoring Physical Activity Diary Data: Development and Validation , 2007 .
[156] Mark Hasegawa-Johnson,et al. Maximum conditional mutual information projection for speech recognition , 2003, INTERSPEECH.
[157] Ming Liu,et al. Frequency domain correspondence for speaker normalization , 2007, INTERSPEECH.
[158] M. Johnson,et al. Improving the performance of CELP-based speech coding at low bit rates , 1991, 1991., IEEE International Symposium on Circuits and Systems.
[159] Thomas S. Huang,et al. Toward robust learning of the Gaussian mixture state emission densities for hidden Markov models , 2010, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing.
[160] Mark Hasegawa-Johnson,et al. State-Transition Interpolation and MAP Adaptation for HMM-based Dysarthric Speech Recognition , 2010, SLPAT@NAACL.
[161] Mark Hasegawa-Johnson,et al. Non-linear maximum likelihood feature transformation for speech recognition , 2003, INTERSPEECH.
[162] Mark Hasegawa-Johnson,et al. An automatic prosody labeling system using ANN-based syntactic-prosodic model and GMM-based acoustic-prosodic model , 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[163] Mark Hasegawa-Johnson,et al. Universal access: speech recognition for talkers with spastic dysarthria , 2009, INTERSPEECH.
[164] Mark Hasegawa-Johnson. A Multi-Stream Approach to Audiovisual Automatic Speech Recognition , 2007, 2007 IEEE 9th Workshop on Multimedia Signal Processing.
[165] M. Hasegawa-Johnson,et al. Strong-sense class-dependent features for statistical recognition , 2003, IEEE Workshop on Statistical Signal Processing, 2003.