Adaptive recognition of different accents conversations based on convolutional neural network
Xue Li | Pan Zhang | Jiang Zhong
[1] Jean-Luc Gauvain,et al. Multistage speaker diarization of broadcast news , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[2] Lin Wu,et al. Effective Multi-Query Expansions: Collaborative Deep Networks for Robust Landmark Retrieval , 2017, IEEE Transactions on Image Processing.
[3] Joachim Diederich,et al. Accent Classification Using Support Vector Machines , 2007, 6th IEEE/ACIS International Conference on Computer and Information Science (ICIS 2007).
[4] F. Kubala,et al. Automatic Speaker Clustering , 1997 .
[5] M. A. Siegler,et al. Automatic Segmentation, Classification and Clustering of Broadcast News Audio , 1997 .
[6] Lin Wu,et al. Robust Subspace Clustering for Multi-View Data by Exploiting Correlation Consensus , 2015, IEEE Transactions on Image Processing.
[7] D. Reddy,et al. Performance of an expert spectrogram reader , 1978 .
[8] Melvyn C. Resnick. Dialect Zones and Automatic Dialect Identification in Latin American Spanish , 1969 .
[9] Stan Davis,et al. Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Se , 1980 .
[10] Gu Mingliang,et al. Semi-supervised learning based Chinese dialect identification , 2008, 2008 9th International Conference on Signal Processing.
[11] Xue Li,et al. Deep Attention-Based Spatially Recursive Networks for Fine-Grained Visual Recognition , 2019, IEEE Transactions on Cybernetics.
[12] Alan McCree,et al. Speaker diarization with i-vectors from DNN senone posteriors , 2015, INTERSPEECH.
[13] Douglas A. Reynolds,et al. Experimental evaluation of features for robust speaker identification , 1994, IEEE Trans. Speech Audio Process.
[14] E. B. Newman,et al. A Scale for the Measurement of the Psychological Magnitude Pitch , 1937 .
[15] F. Karray,et al. Speaker Accent Classification System Using a Fuzzy Gaussian Classifier , 2007, 2007 International Conference on Information and Emerging Technologies.
[16] Lin Wu,et al. What-and-Where to Match: Deep Spatially Multiplicative Integration Networks for Person Re-identification , 2017, Pattern Recognit.
[17] Yang Wang,et al. Structured Deep Hashing with Convolutional Neural Networks for Fast Person Re-identification , 2017, Comput. Vis. Image Underst.
[18] S. Speer,et al. Intonation and sentence processing , 2003 .
[19] Lori Lamel,et al. An expert spectrogram reader: A knowledge-based approach to speech recognition , 1986, ICASSP '86. IEEE International Conference on Acoustics, Speech, and Signal Processing.
[20] John H. L. Hansen,et al. Language accent classification in American English , 1996, Speech Commun.
[21] Christian Wellekens,et al. DISTBIC: A speaker-based segmentation for audio data indexing , 2000, Speech Commun.
[22] Mauro Cettolo,et al. Evaluation of BIC-based algorithms for audio segmentation , 2005, Comput. Speech Lang.
[23] Melvyn C. Resnick. Phonological Variants and Dialect Identification in Latin American Spanish , 1980 .
[24] John H. L. Hansen,et al. Discrete-Time Processing of Speech Signals , 1993 .
[25] Ara Samouelian,et al. Knowledge Based Approach To Speech Recognition , 1994 .
[26] S. Chen,et al. Speaker, Environment and Channel Change Detection and Clustering via the Bayesian Information Criterion , 1998 .
[27] Thomas Quatieri,et al. Discrete-Time Speech Signal Processing: Principles and Practice , 2001 .
[28] Tomi Kinnunen,et al. A New Segmentation Algorithm Combined with Transient Frames Power for Text Independent Speaker Verification , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.
[29] R. P. Ramachandran,et al. Robust speaker recognition: a feature-based approach , 1996, IEEE Signal Processing Magazine.
[30] Christian Wellekens,et al. A speaker tracking system based on speaker turn detection for NIST evaluation , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).
[31] Ken Perlin,et al. Real-Time Continuous Pose Recovery of Human Hands Using Convolutional Networks , 2014, ACM Trans. Graph.
[32] Xue Li,et al. A Combined Feature Approach for Speaker Segmentation Using Convolution Neural Network , 2017, PCM.
[33] Wooil Kim,et al. Speech Recognition Accuracy Prediction Using Speech Quality Measure , 2016 .
[34] Herbert Gish,et al. Segregation of speakers for speech recognition and speaker identification , 1991, [Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing.
[35] Lin Wu,et al. Iterative Views Agreement: An Iterative Low-Rank Based Structured Optimization Method to Multi-View Spectral Clustering , 2016, IJCAI.
[36] Lin Wu,et al. Deep adaptive feature embedding with local sample distributions for person re-identification , 2017, Pattern Recognit.
[37] Ramesh A. Gopinath,et al. Transcription Of Broadcast News Shows With The Ibm Large Vocabulary Speech Recognition System , 1997 .
[38] Douglas A. Reynolds,et al. An overview of automatic speaker diarization systems , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[39] Lin Wu,et al. Multiview Spectral Clustering via Structured Low-Rank Matrix Factorization , 2017, IEEE Transactions on Neural Networks and Learning Systems.
[40] Hervé Bourlard,et al. Robust speaker change detection , 2004, IEEE Signal Processing Letters.
[41] Noureddine Ellouze,et al. Robust audio speaker segmentation using one class SVMS , 2008, 2008 16th European Signal Processing Conference.