Siamese Style Convolutional Neural Networks for Sound Search by Vocal Imitation
[1] Honglak Lee,et al. Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations , 2009, ICML '09.
[2] C. Gross. Genealogy of the “Grandmother Cell” , 2002, The Neuroscientist: a review journal bringing neurobiology, neurology and psychiatry.
[3] Brian Christopher Smith,et al. Query by humming: musical information retrieval in an audio database , 1995, MULTIMEDIA '95.
[4] Zafar Rafii,et al. An audio fingerprinting system for live version identification using image processing techniques , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[5] Gaël Richard,et al. Drum Loops Retrieval from Spoken Queries , 2005, Journal of Intelligent Information Systems.
[6] Yann LeCun,et al. Learning a similarity metric discriminatively, with application to face verification , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).
[7] Zhiyao Duan,et al. IMISOUND: An Unsupervised System for Sound Query by Vocal Imitation , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[8] Hwee Tou Ng,et al. A lattice-based approach to query-by-example spoken document retrieval , 2008, SIGIR '08.
[9] Jordi Janer,et al. Sound Retrieval From Voice Imitation Queries In Collaborative Databases , 2014, Semantic Audio.
[10] Avery Wang,et al. An Industrial Strength Audio Search Algorithm , 2003, ISMIR.
[11] Aaron C. Courville,et al. Understanding Representations Learned in Deep Architectures , 2010 .
[12] Geoffrey E. Hinton,et al. Reducing the Dimensionality of Data with Neural Networks , 2006, Science.
[13] R. A. Leibler,et al. On Information and Sufficiency , 1951 .
[14] Thierry Bertin-Mahieux,et al. Large-scale cover song recognition using hashed chroma landmarks , 2011, 2011 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA).
[15] Xavier Serra,et al. Querying Freesound with a microphone , 2015 .
[16] Justin Salamon,et al. Deep Convolutional Neural Networks and Data Augmentation for Environmental Sound Classification , 2016, IEEE Signal Processing Letters.
[17] George Tzanetakis,et al. A comparative evaluation of search techniques for query-by-humming using the MUSART testbed , 2007 .
[18] Jae Lim,et al. Signal estimation from modified short-time Fourier transform , 1984 .
[19] Patrick Susini,et al. The Timbre Toolbox: extracting audio descriptors from musical signals , 2011, The Journal of the Acoustical Society of America.
[20] Bryan Pardo,et al. VocalSketch: Vocally Imitating Audio Concepts , 2015, CHI.
[21] Gregory R. Koch,et al. Siamese Neural Networks for One-Shot Image Recognition , 2015 .
[22] Matias Lindgren,et al. Deep learning for spoken language identification , 2020 .
[23] Yichi Zhang,et al. Supervised and Unsupervised Sound Retrieval by Vocal Imitation , 2016 .
[24] Rob Fergus,et al. Visualizing and Understanding Convolutional Networks , 2013, ECCV.
[25] Moshé M. Zloof. Query-by-Example: A Data Base Language , 1977, IBM Syst. J.
[26] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[27] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[28] D. Opitz,et al. Popular Ensemble Methods: An Empirical Study , 1999, J. Artif. Intell. Res.
[29] Zhiyao Duan,et al. Retrieving sounds by vocal imitation recognition , 2015, 2015 IEEE 25th International Workshop on Machine Learning for Signal Processing (MLSP).
[30] Ajay Kapur,et al. Query-by-Beat-Boxing: Music Retrieval For The DJ , 2004, ISMIR.
[31] Antoine Liutkus,et al. A Multi-resolution approach to Common Fate-based audio separation , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[32] Harris Wu,et al. Evaluating Web-based Question Answering Systems , 2002, LREC.
[33] Ke Chen,et al. Extracting Speaker-Specific Information with a Regularized Siamese Deep Network , 2011, NIPS.
[34] Meinard Müller,et al. Known Artist Live Song ID: A Hashprint Approach , 2016, ISMIR.
[35] S. Chiba,et al. Dynamic programming algorithm optimization for spoken word recognition , 1978 .
[36] Christian Schörkhuber. Constant-Q Transform Toolbox for Music Processing , 2010 .
[37] Luca Bertinetto,et al. Fully-Convolutional Siamese Networks for Object Tracking , 2016, ECCV Workshops.
[38] Zhiyao Duan,et al. IMINET: Convolutional semi-siamese networks for sound search by vocal imitation , 2017, 2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA).
[39] Rahul Sukthankar,et al. MatchNet: Unifying feature and metric learning for patch-based matching , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[40] Yann LeCun,et al. Signature Verification Using A "Siamese" Time Delay Neural Network , 1993, Int. J. Pattern Recognit. Artif. Intell.
[41] Tuomas Virtanen,et al. Audio Query by Example Using Similarity Measures between Probability Density Functions of Features , 2010, EURASIP J. Audio Speech Music. Process.
[42] Powen Ru,et al. Multiresolution spectrotemporal analysis of complex sounds , 2005, The Journal of the Acoustical Society of America.
[43] G. Montavon. Deep learning for spoken language identification , 2009 .
[44] Zhiyao Duan,et al. Visualization and Interpretation of Siamese Style Convolutional Neural Networks for Sound Search by Vocal Imitation , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).