Fast Query-by-Example Speech Search Using Attention-Based Deep Binary Embeddings
[1] Karen Livescu,et al. Deep convolutional acoustic word embeddings using word-pair side information , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[2] Martín Abadi,et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems , 2016, ArXiv.
[3] Tara N. Sainath,et al. Query-by-example keyword spotting using long short-term memory networks , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[4] Timothy J. Hazen,et al. Query-by-example spoken term detection using phonetic posteriorgram templates , 2009, 2009 IEEE Workshop on Automatic Speech Recognition & Understanding.
[5] Svetlana Lazebnik,et al. Iterative quantization: A procrustean approach to learning binary codes , 2011, CVPR 2011.
[6] Cheung-Chi Leung,et al. Unsupervised spoken term detection with acoustic segment model , 2011, 2011 International Conference on Speech Database and Assessments (Oriental COCOSDA).
[7] Bhuvana Ramabhadran,et al. Query-by-example Spoken Term Detection For OOV terms , 2009, 2009 IEEE Workshop on Automatic Speech Recognition & Understanding.
[8] Aren Jansen,et al. Fixed-dimensional acoustic embeddings of variable-length segments in low-resource settings , 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.
[9] Jiwen Lu,et al. Learning Compact Binary Descriptors with Unsupervised Deep Neural Networks , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[10] Samy Bengio,et al. Tacotron: Towards End-to-End Speech Synthesis , 2017, INTERSPEECH.
[11] Zhang Zuping,et al. A Hierarchical Structured Self-Attentive Model for Extractive Document Summarization (HSSAS) , 2018, IEEE Access.
[12] Jianmin Wang,et al. Deep Hashing Network for Efficient Similarity Retrieval , 2016, AAAI.
[13] Lianhong Cai,et al. Siamese Recurrent Auto-Encoder Representation for Query-by-Example Spoken Term Detection , 2018, INTERSPEECH.
[14] Piotr Indyk,et al. Similarity Search in High Dimensions via Hashing , 1999, VLDB.
[15] Jan Cernocký,et al. Comparison of methods for language-dependent and language-independent query-by-example spoken term detection , 2012, TOIS.
[16] Hanjiang Lai,et al. Supervised Hashing for Image Retrieval via Image Representation Learning , 2014, AAAI.
[17] Bin Ma,et al. Parallel inference of Dirichlet process Gaussian mixture models for unsupervised acoustic modeling: a feasibility study , 2015, INTERSPEECH.
[18] Rodrigo C. Barros,et al. Fast Self-Attentive Multimodal Retrieval , 2018, 2018 IEEE Winter Conference on Applications of Computer Vision (WACV).
[19] Martin Karafiát,et al. Convolutive Bottleneck Network features for LVCSR , 2011, 2011 IEEE Workshop on Automatic Speech Recognition & Understanding.
[20] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[21] Kishore Prahallad,et al. Query-by-Example Spoken Term Detection using Frequency Domain Linear Prediction and Non-Segmental Dynamic Time Warping , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[22] Bin Ma,et al. Query-by-Example Speech Search Using Recurrent Neural Acoustic Word Embeddings With Temporal Context , 2019, IEEE Access.
[23] James R. Glass,et al. Unsupervised Pattern Discovery in Speech , 2008, IEEE Transactions on Audio, Speech, and Language Processing.
[24] Jianmin Wang,et al. Deep Visual-Semantic Quantization for Efficient Image Retrieval , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[25] Yoshua Bengio,et al. End-to-end attention-based large vocabulary speech recognition , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[26] Hung-An Chang,et al. Resource configurable spoken query detection using Deep Boltzmann Machines , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[27] Navdeep Jaitly,et al. Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[28] Bin Ma,et al. Pairwise learning using multi-lingual bottleneck features for low-resource query-by-example spoken term detection , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[29] Yoshua Bengio,et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention , 2015, ICML.
[30] Bin Ma,et al. An acoustic segment modeling approach to query-by-example spoken term detection , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[31] Aren Jansen,et al. Efficient spoken term discovery using randomized algorithms , 2011, 2011 IEEE Workshop on Automatic Speech Recognition & Understanding.
[32] Lin-Shan Lee,et al. Audio Word2Vec: Unsupervised Learning of Audio Segment Representations Using Sequence-to-Sequence Autoencoder , 2016, INTERSPEECH.
[33] Jan Cernocký,et al. Probabilistic and Bottle-Neck Features for LVCSR of Meetings , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.
[34] Karen Livescu,et al. Multi-view Recurrent Neural Acoustic Word Embeddings , 2016, ICLR.
[35] Jianmin Wang,et al. Deep Quantization Network for Efficient Image Retrieval , 2016, AAAI.
[36] Cheng Deng,et al. Unsupervised Deep Generative Adversarial Hashing Network , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[37] Trevor Darrell,et al. Learning to Hash with Binary Reconstructive Embeddings , 2009, NIPS.
[38] Bowen Zhou,et al. A Structured Self-attentive Sentence Embedding , 2017, ICLR.
[39] David J. Fleet,et al. Minimal Loss Hashing for Compact Binary Codes , 2011, ICML.
[40] Aren Jansen,et al. Segmental acoustic indexing for zero resource keyword search , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[41] Christopher D. Manning,et al. Effective Approaches to Attention-based Neural Machine Translation , 2015, EMNLP.
[42] Lin-Shan Lee,et al. Segmental Audio Word2Vec: Representing Utterances as Sequences of Vectors with Applications in Spoken Term Detection , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[43] Jia Wang,et al. Unsupervised Triplet Hashing for Fast Image Retrieval , 2017, ACM Multimedia.
[44] George Saon,et al. Advancing Sequence-to-Sequence Based Speech Recognition , 2019, INTERSPEECH.
[45] Hung-yi Lee,et al. Query-by-Example Spoken Term Detection Using Attention-Based Multi-Hop Networks , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[46] Karen Livescu,et al. Discriminative acoustic word embeddings: Recurrent neural network-based approaches , 2016, 2016 IEEE Spoken Language Technology Workshop (SLT).
[47] Shujie Liu,et al. Neural Speech Synthesis with Transformer Network , 2018, AAAI.
[48] Lin-Shan Lee,et al. Model-Based Unsupervised Spoken Term Detection with Spoken Queries , 2013, IEEE Transactions on Audio, Speech, and Language Processing.
[49] Daniel Povey,et al. Self-Attentive Speaker Embeddings for Text-Independent Speaker Verification , 2018, INTERSPEECH.
[50] Navdeep Jaitly,et al. Towards End-To-End Speech Recognition with Recurrent Neural Networks , 2014, ICML.
[51] Hanjiang Lai,et al. Simultaneous feature learning and hash coding with deep neural networks , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[52] Antonio Torralba,et al. Spectral Hashing , 2008, NIPS.
[53] James R. Glass,et al. Spoken Content Retrieval—Beyond Cascading Speech Recognition with Text Retrieval , 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[54] Hwee Tou Ng,et al. A lattice-based approach to query-by-example spoken document retrieval , 2008, SIGIR '08.
[55] Rongrong Ji,et al. Supervised hashing with kernels , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[56] Yoshua Bengio,et al. Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.
[57] Yiwei Zhou,et al. Clickbait Detection in Tweets Using Self-attentive Network , 2017, ArXiv.
[58] James R. Glass,et al. Towards multi-speaker unsupervised speech pattern discovery , 2010, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing.
[59] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[60] Geoffrey E. Hinton,et al. Speech recognition with deep recurrent neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[61] Bin Ma,et al. Using parallel tokenizers with DTW matrix combination for low-resource spoken term detection , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[62] James R. Glass,et al. Unsupervised spoken keyword spotting via segmental DTW on Gaussian posteriorgrams , 2009, 2009 IEEE Workshop on Automatic Speech Recognition & Understanding.
[63] Yoshua Bengio,et al. Attention-Based Models for Speech Recognition , 2015, NIPS.
[64] Yoshua Bengio,et al. End-to-end Continuous Speech Recognition using Attention-based Recurrent NN: First Results , 2014, ArXiv.
[65] Tara N. Sainath,et al. State-of-the-Art Speech Recognition with Sequence-to-Sequence Models , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[66] Haizhou Li,et al. Multitask Feature Learning for Low-Resource Query-by-Example Spoken Term Detection , 2017, IEEE Journal of Selected Topics in Signal Processing.
[67] Navdeep Jaitly,et al. Hybrid speech recognition with Deep Bidirectional LSTM , 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.
[68] Karen Livescu,et al. Query-by-Example Search with Discriminative Neural Acoustic Word Embeddings , 2017, INTERSPEECH.