Adaptive recognition of different accents conversations based on convolutional neural network

Xue Li | Pan Zhang | Jiang Zhong

[1] Jean-Luc Gauvain,et al. Multistage speaker diarization of broadcast news , 2006, IEEE Transactions on Audio, Speech, and Language Processing.

[2] Lin Wu,et al. Effective Multi-Query Expansions: Collaborative Deep Networks for Robust Landmark Retrieval , 2017, IEEE Transactions on Image Processing.

[3] Joachim Diederich,et al. Accent Classification Using Support Vector Machines , 2007, 6th IEEE/ACIS International Conference on Computer and Information Science (ICIS 2007).

[4] F. Kubala,et al. Automatic Speaker Clustering , 1997 .

[5] M. A. Siegler,et al. Automatic Segmentation, Classification and Clustering of Broadcast News Audio , 1997 .

[6] Lin Wu,et al. Robust Subspace Clustering for Multi-View Data by Exploiting Correlation Consensus , 2015, IEEE Transactions on Image Processing.

[7] D. Reddy,et al. Performance of an expert spectrogram reader , 1978 .

[8] Melvyn C. Resnick. Dialect Zones and Automatic Dialect Identification in Latin American Spanish , 1969 .

[9] Stan Davis,et al. Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Sentences , 1980 .

[10] Gu Mingliang,et al. Semi-supervised learning based Chinese dialect identification , 2008, 2008 9th International Conference on Signal Processing.

[11] Xue Li,et al. Deep Attention-Based Spatially Recursive Networks for Fine-Grained Visual Recognition , 2019, IEEE Transactions on Cybernetics.

[12] Alan McCree,et al. Speaker diarization with i-vectors from DNN senone posteriors , 2015, INTERSPEECH.

[13] Douglas A. Reynolds,et al. Experimental evaluation of features for robust speaker identification , 1994, IEEE Trans. Speech Audio Process.

[14] E. B. Newman,et al. A Scale for the Measurement of the Psychological Magnitude Pitch , 1937 .

[15] F. Karray,et al. Speaker Accent Classification System Using a Fuzzy Gaussian Classifier , 2007, 2007 International Conference on Information and Emerging Technologies.

[16] Lin Wu,et al. What-and-Where to Match: Deep Spatially Multiplicative Integration Networks for Person Re-identification , 2017, Pattern Recognit.

[17] Yang Wang,et al. Structured Deep Hashing with Convolutional Neural Networks for Fast Person Re-identification , 2017, Comput. Vis. Image Underst.

[18] S. Speer,et al. Intonation and sentence processing , 2003 .

[19] Lori Lamel,et al. An expert spectrogram reader: A knowledge-based approach to speech recognition , 1986, ICASSP '86. IEEE International Conference on Acoustics, Speech, and Signal Processing.

[20] John H. L. Hansen,et al. Language accent classification in American English , 1996, Speech Commun.

[21] Christian Wellekens,et al. DISTBIC: A speaker-based segmentation for audio data indexing , 2000, Speech Commun.

[22] Mauro Cettolo,et al. Evaluation of BIC-based algorithms for audio segmentation , 2005, Comput. Speech Lang.

[23] Melvyn C. Resnick. Phonological Variants and Dialect Identification in Latin American Spanish , 1980 .

[24] John H. L. Hansen,et al. Discrete-Time Processing of Speech Signals , 1993 .

[25] Ara Samouelian,et al. Knowledge Based Approach To Speech Recognition , 1994 .

[26] S. Chen,et al. Speaker, Environment and Channel Change Detection and Clustering via the Bayesian Information Criterion , 1998 .

[27] Thomas Quatieri,et al. Discrete-Time Speech Signal Processing: Principles and Practice , 2001 .

[28] Tomi Kinnunen,et al. A New Segmentation Algorithm Combined with Transient Frames Power for Text Independent Speaker Verification , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.

[29] R. P. Ramachandran,et al. Robust speaker recognition: a feature-based approach , 1996, IEEE Signal Processing Magazine.

[30] Christian Wellekens,et al. A speaker tracking system based on speaker turn detection for NIST evaluation , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[31] Ken Perlin,et al. Real-Time Continuous Pose Recovery of Human Hands Using Convolutional Networks , 2014, ACM Trans. Graph.

[32] Xue Li,et al. A Combined Feature Approach for Speaker Segmentation Using Convolution Neural Network , 2017, PCM.

[33] Wooil Kim,et al. Speech Recognition Accuracy Prediction Using Speech Quality Measure , 2016 .

[34] Herbert Gish,et al. Segregation of speakers for speech recognition and speaker identification , 1991, ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing.

[35] Lin Wu,et al. Iterative Views Agreement: An Iterative Low-Rank Based Structured Optimization Method to Multi-View Spectral Clustering , 2016, IJCAI.

[36] Lin Wu,et al. Deep adaptive feature embedding with local sample distributions for person re-identification , 2017, Pattern Recognit.

[37] Ramesh A. Gopinath,et al. Transcription of Broadcast News Shows with the IBM Large Vocabulary Speech Recognition System , 1997 .

[38] Douglas A. Reynolds,et al. An overview of automatic speaker diarization systems , 2006, IEEE Transactions on Audio, Speech, and Language Processing.

[39] Lin Wu,et al. Multiview Spectral Clustering via Structured Low-Rank Matrix Factorization , 2017, IEEE Transactions on Neural Networks and Learning Systems.

[40] Hervé Bourlard,et al. Robust speaker change detection , 2004, IEEE Signal Processing Letters.

[41] Noureddine Ellouze,et al. Robust audio speaker segmentation using one class SVMs , 2008, 2008 16th European Signal Processing Conference.