Audio classification using attention-augmented convolutional neural network
[1] Wootaek Lim,et al. Speech emotion recognition using convolutional and Recurrent Neural Networks , 2016, 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA).
[2] Jun Du,et al. Hierarchical deep neural network for multivariate regression , 2017, Pattern Recognit..
[3] Ling He,et al. Time-frequency feature extraction from spectrograms and wavelet packets with application to automatic stress and emotion classification in speech , 2009, 2009 7th International Conference on Information, Communications and Signal Processing (ICICS).
[4] Giovanni Costantini,et al. Speech emotion recognition using amplitude modulation parameters and a combined feature selection procedure , 2014, Knowl. Based Syst..
[5] Buket D. Barkana,et al. Deep neural network framework and transformed MFCCs for speaker's age and gender classification , 2017, Knowl. Based Syst..
[6] Zhouyu Fu,et al. Optimizing Cepstral Features for Audio Classification , 2013, IJCAI.
[7] Jun Du. Irrelevant Variability Normalization via Hierarchical Deep Neural Networks for Online Handwritten Chinese Character Recognition , 2014, 2014 14th International Conference on Frontiers in Handwriting Recognition.
[8] Sridha Sridharan,et al. i-vector Based Speaker Recognition on Short Utterances , 2011, INTERSPEECH.
[9] Alex Graves,et al. Recurrent Models of Visual Attention , 2014, NIPS.
[10] Zhang Yi,et al. Foundations of Implementing the Competitive Layer Model by Lotka–Volterra Recurrent Neural Networks , 2010, IEEE Transactions on Neural Networks.
[11] John H. L. Hansen,et al. Unsupervised accent classification for deep data fusion of accent and language information , 2016, Speech Commun..
[12] Douglas A. Reynolds,et al. Robust text-independent speaker identification using Gaussian mixture speaker models , 1995, IEEE Trans. Speech Audio Process..
[13] Jürgen Schmidhuber,et al. Deep Networks with Internal Selective Attention through Feedback Connections , 2014, NIPS.
[14] Wen Gao,et al. Learning Affective Features With a Hybrid Deep Model for Audio–Visual Emotion Recognition , 2018, IEEE Transactions on Circuits and Systems for Video Technology.
[15] Javier Hernando,et al. Deep belief networks for i-vector based speaker recognition , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[16] Y. X. Zou,et al. An experimental study of speech emotion recognition based on deep convolutional neural networks , 2015, 2015 International Conference on Affective Computing and Intelligent Interaction (ACII).
[17] Hervé Bourlard,et al. A new ASR approach based on independent processing and recombination of partial frequency bands , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.
[18] Danilo Comminiello,et al. Benchmarking Functional Link Expansions for Audio Classification Tasks , 2016, Advances in Neural Networks.
[19] Hervé Bourlard,et al. Subband-based speech recognition , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[20] Yoshua Bengio,et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention , 2015, ICML.
[21] Masato Akagi,et al. Toward affective speech-to-speech translation: Strategy for emotional speech recognition and synthesis in multiple languages , 2014, Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific.
[22] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[23] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[24] Simon Lucey,et al. Convolutional Sparse Coding for Trajectory Reconstruction , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[25] Tze Fen Li,et al. Speech recognition of mandarin syllables using both linear predict coding cepstra and Mel frequency cepstra , 2007, ROCLING/IJCLCLP.
[26] Dinggang Shen,et al. A Robust Deep Model for Improved Classification of AD/MCI Patients , 2015, IEEE Journal of Biomedical and Health Informatics.
[27] Kah Phooi Seng,et al. A new approach of audio emotion recognition , 2014, Expert Syst. Appl..
[28] Yan Leng,et al. Employing unlabeled data to improve the classification performance of SVM, and its application in audio event classification , 2016, Knowl. Based Syst..
[29] Mina Ibrahim,et al. Improved text-independent speaker identification system for real time applications , 2016, 2016 Fourth International Japan-Egypt Conference on Electronics, Communications and Computers (JEC-ECC).
[30] Ran Chong-sen. Speech Enhancement Using Sub-band Spectral Analysis , 2006.
[31] Masakiyo Fujimoto,et al. Exploiting spectro-temporal locality in deep learning based acoustic event detection , 2015, EURASIP J. Audio Speech Music. Process..
[32] Geoffrey E. Hinton,et al. Deep Learning , 2015, Nature.
[33] Margaret Lech,et al. Towards real-time Speech Emotion Recognition using deep neural networks , 2015, 2015 9th International Conference on Signal Processing and Communication Systems (ICSPCS).
[34] Ronald A. Rensink. The Dynamic Representation of Scenes , 2000.
[35] Erik Cambria,et al. Towards an intelligent framework for multimodal affective data analysis , 2015, Neural Networks.
[36] Honglak Lee,et al. Unsupervised feature learning for audio classification using convolutional deep belief networks , 2009, NIPS.
[37] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.