Complex Linear Projection (CLP): A Discriminative Approach to Joint Feature Extraction and Acoustic Modeling
[1] Tara N. Sainath,et al. Learning the speech front-end with raw waveform CLDNNs, 2015, INTERSPEECH.
[2] D J Field,et al. Relations between the statistics of natural images and the response properties of cortical cells, 1987, Journal of the Optical Society of America. A, Optics and image science.
[3] M. Picheny,et al. Comparison of Parametric Representation for Monosyllabic Word Recognition in Continuously Spoken Sentences, 2017.
[4] Hynek Hermansky,et al. Data Driven Design of Filter Bank for Speech Recognition , 2001, TSD.
[5] Boaz Rafaely,et al. Microphone Array Signal Processing, 2008.
[6] Dong Yu,et al. Exploiting sparseness in deep neural networks for large vocabulary speech recognition , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[7] Hermann Ney,et al. Acoustic modeling with deep neural networks using raw time signal for LVCSR , 2014, INTERSPEECH.
[8] Dimitri Palaz,et al. Analysis of CNN-based speech recognition system using raw speech as input , 2015, INTERSPEECH.
[9] Marc'Aurelio Ranzato,et al. Unsupervised Learning of Invariant Feature Hierarchies with Applications to Object Recognition , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.
[10] D. Gabor,et al. Theory of communication. Part 1: The analysis of information, 1946.
[11] Simon Haykin,et al. Adaptive Signal Processing: Next Generation Solutions, 2010.
[12] Marc'Aurelio Ranzato,et al. Large Scale Distributed Deep Networks , 2012, NIPS.
[13] Geoffrey E. Hinton,et al. Learning a better representation of speech soundwaves using restricted boltzmann machines , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[14] Qiang Chen,et al. Network In Network, 2013, ICLR.
[15] Alain Biem,et al. A discriminative filter bank model for speech recognition , 1995, EUROSPEECH.
[16] Tara N. Sainath,et al. Factored spatial and spectral multichannel raw waveform CLDNNs , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[17] Lawrence D. Jackel,et al. Handwritten Digit Recognition with a Back-Propagation Network , 1989, NIPS.
[18] T J Sejnowski,et al. Learning the higher-order structure of a natural sound, 1996, Network.
[19] Michael S. Lewicki,et al. Efficient coding of natural sounds , 2002, Nature Neuroscience.
[20] Bingbing Ni,et al. Geometric ℓp-norm feature pooling for image classification , 2011, CVPR 2011.
[21] R Linsker,et al. Perceptual neural organization: some approaches based on network models and information theory, 1990, Annual review of neuroscience.
[22] Dimitri Palaz,et al. Estimating phoneme class conditional probabilities from raw speech signal using convolutional neural networks , 2013, INTERSPEECH.
[23] Hermann Ney,et al. Convolutional neural networks for acoustic modeling of raw time signal in LVCSR , 2015, INTERSPEECH.
[24] Tara N. Sainath,et al. Speaker location and microphone spacing invariant acoustic modeling from raw multichannel waveforms , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[25] Terrence J. Sejnowski,et al. An Information-Maximization Approach to Blind Separation and Blind Deconvolution , 1995, Neural Computation.
[26] Tara N. Sainath,et al. Learning filter banks within a deep neural network framework , 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.
[27] Ron J. Weiss,et al. Speech acoustic modeling from raw multichannel waveforms , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[28] Tara N. Sainath,et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition, 2012.
[29] A. Aertsen,et al. Spectro-temporal receptive fields of auditory neurons in the grassfrog , 1980, Biological Cybernetics.