Learning-Based Methods for Comparing Sequences, with Applications to Audio-to-MIDI Alignment and Matching
[1] M. Kendall. A New Measure of Rank Correlation , 1938 .
[2] A. Bhattacharyya. On a measure of divergence between two statistical populations defined by their probability distributions , 1943 .
[3] J. Knott. The organization of behavior: A neuropsychological theory , 1951 .
[4] F. Rosenblatt,et al. The perceptron: a probabilistic model for information storage and organization in the brain. , 1958, Psychological Review.
[5] Boris Polyak. Some methods of speeding up the convergence of iteration methods , 1964 .
[6] J. Kruskal. Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis , 1964 .
[7] Peter E. Hart,et al. Nearest neighbor pattern classification , 1967, IEEE Trans. Inf. Theory.
[8] S. Chiba,et al. Dynamic programming algorithm optimization for spoken word recognition , 1978 .
[9] J. Blauert. Spatial Hearing: The Psychophysics of Human Sound Localization , 1983 .
[10] Geoffrey E. Hinton,et al. Learning representations by back-propagating errors , 1986, Nature.
[11] F. Richard Moore,et al. The Dysfunctions of MIDI , 1988, ICMC.
[12] Lorien Y. Pratt,et al. Comparing Biases for Minimal Network Construction with Back-Propagation , 1988, NIPS.
[13] George Cybenko,et al. Approximation by superpositions of a sigmoidal function , 1989, Math. Control. Signals Syst..
[14] Lawrence D. Jackel,et al. Backpropagation Applied to Handwritten Zip Code Recognition , 1989, Neural Computation.
[15] Kurt Hornik,et al. Multilayer feedforward networks are universal approximators , 1989, Neural Networks.
[16] W. S. McCulloch,et al. A logical calculus of the ideas immanent in nervous activity , 1990, The Philosophy of Artificial Intelligence.
[17] Paul J. Werbos,et al. Backpropagation Through Time: What It Does and How to Do It , 1990, Proc. IEEE.
[18] Judith C. Brown. Calculation of a constant Q spectral transform , 1991 .
[19] Judith C. Brown,et al. An efficient algorithm for the calculation of a constant Q transform , 1992 .
[20] Donald J. Berndt,et al. Using Dynamic Time Warping to Find Patterns in Time Series , 1994, KDD Workshop.
[21] Yoshua Bengio,et al. Learning long-term dependencies with gradient descent is difficult , 1994, IEEE Trans. Neural Networks.
[22] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[23] Perry R. Cook,et al. Music, cognition, and computerized sound: an introduction to psychoacoustics , 1999 .
[24] Takuya Fujishima,et al. Realtime Chord Recognition of Musical Sound: a System Using Common Lisp Music , 1999, ICMC.
[25] Chi Lap Yip,et al. Selection of melody lines for music databases , 2000, Proceedings 24th Annual International Computer Software and Applications Conference. COMPSAC2000.
[26] Eric Jones,et al. SciPy: Open Source Scientific Tools for Python , 2001 .
[27] Michael Good,et al. MusicXML for notation and analysis , 2001 .
[28] Simon Dixon,et al. Automatic Extraction of Tempo and Beat From Expressive Performances , 2001 .
[29] Eamonn J. Keogh,et al. Dimensionality Reduction for Fast Similarity Search in Large Time Series Databases , 2001, Knowledge and Information Systems.
[30] George Tzanetakis,et al. Musical genre classification of audio signals , 2002, IEEE Trans. Speech Audio Process..
[31] George Tzanetakis,et al. Polyphonic audio matching and alignment for music retrieval , 2003, 2003 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics.
[32] Daniel P. W. Ellis,et al. Ground-truth transcriptions of real music from force-aligned MIDI syntheses , 2003, ISMIR.
[33] Marina Bosi,et al. Introduction to Digital Audio Coding and Standards , 2004, J. Electronic Imaging.
[34] Eamonn J. Keogh,et al. Everything you know about Dynamic Time Warping is Wrong , 2004 .
[35] Kunihiko Fukushima,et al. Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position , 1980, Biological Cybernetics.
[36] Daniel P. W. Ellis,et al. A Large-Scale Evaluation of Acoustic and Subjective Music-Similarity Measures , 2004, Computer Music Journal.
[37] Gerhard Widmer,et al. MATCH: A Music Alignment Tool Chest , 2005, ISMIR.
[38] Daniel P. W. Ellis,et al. Song-Level Features and Support Vector Machines for Music Classification , 2005, ISMIR.
[39] Mark B. Sandler,et al. A tutorial on onset detection in music signals , 2005, IEEE Transactions on Speech and Audio Processing.
[40] Geoffrey E. Hinton,et al. Reducing the Dimensionality of Data with Neural Networks , 2006, Science.
[41] Ichiro Fujinaga,et al. jSymbolic: A Feature Extractor for MIDI Files , 2006, ICMC.
[42] Yee Whye Teh,et al. A Fast Learning Algorithm for Deep Belief Nets , 2006, Neural Computation.
[43] Jürgen Schmidhuber,et al. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks , 2006, ICML.
[44] Mark Sandler,et al. Signal Processing Parameters for Tonality Estimation , 2007 .
[45] Philip Chan,et al. Toward accurate dynamic time warping in linear time and space , 2007, Intell. Data Anal..
[46] Smith,et al. Mathematics of the Discrete Fourier Transform (DFT) with Audio Applications , 2007 .
[47] Thomas Hofmann,et al. Greedy Layer-Wise Training of Deep Networks , 2007 .
[48] Gert R. G. Lanckriet,et al. Towards musical query-by-semantic-description using the CAL500 data set , 2007, SIGIR.
[49] Daniel P. W. Ellis,et al. A Discriminative Model for Polyphonic Piano Transcription , 2007, EURASIP J. Adv. Signal Process..
[50] Daniel Müllensiefen,et al. Bayesian Model Selection for Harmonic Labelling , 2007 .
[51] Yoshua Bengio,et al. Learning Deep Architectures for AI , 2007, Found. Trends Mach. Learn..
[52] Praveen Paritosh,et al. Freebase: a collaboratively created graph database for structuring human knowledge , 2008, SIGMOD Conference.
[53] David Rizo,et al. Mining Digital Music Score Collections: Melody Extraction and Genre Recognition , 2008 .
[54] J. Stephen Downie,et al. The music information retrieval evaluation exchange (2005-2007): A window into music information retrieval research , 2008, Acoustical Science and Technology.
[55] Alex Graves,et al. Supervised Sequence Labelling with Recurrent Neural Networks , 2012, Studies in Computational Intelligence.
[56] Stephen Cranefield,et al. A Study on Feature Analysis for Musical Instrument Classification , 2008, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).
[57] Yann LeCun,et al. What is the best multi-stage architecture for object recognition? , 2009, 2009 IEEE 12th International Conference on Computer Vision.
[58] Norberto Degara,et al. Evaluation Methods for Musical Audio Beat Tracking Algorithms , 2009 .
[59] Carl E. Rasmussen,et al. Gaussian processes for machine learning , 2005, Adaptive computation and machine learning.
[60] Jason Weston,et al. Curriculum learning , 2009, ICML '09.
[61] Pascal Vincent,et al. The Difficulty of Training Deep Architectures and the Effect of Unsupervised Pre-Training , 2009, AISTATS.
[62] Yoram Singer,et al. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization , 2011, J. Mach. Learn. Res..
[63] Christian Schörkhuber. Constant-Q Transform Toolbox for Music Processing , 2010 .
[64] Youngmoo E. Kim,et al. Exploring automatic music annotation with "acoustically-objective" tags , 2010, MIR '10.
[65] Geoffrey E. Hinton,et al. Rectified Linear Units Improve Restricted Boltzmann Machines , 2010, ICML.
[66] Thierry Bertin-Mahieux,et al. Clustering Beat-Chroma Patterns in a Large Music Database , 2010, ISMIR.
[67] Yoshua Bengio,et al. Understanding the difficulty of training deep feedforward neural networks , 2010, AISTATS.
[68] Luca Maria Gambardella,et al. Deep, Big, Simple Neural Nets for Handwritten Digit Recognition , 2010, Neural Computation.
[69] Nando de Freitas,et al. A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning , 2010, ArXiv.
[70] Christopher Ariza,et al. Music21: A Toolkit for Computer-Aided Musicology and Symbolic Music Data , 2010, ISMIR.
[71] Ilya Sutskever,et al. Learning Recurrent Neural Networks with Hessian-Free Optimization , 2011, ICML.
[72] Thierry Bertin-Mahieux,et al. The Million Song Dataset , 2011, ISMIR.
[73] Yoshua Bengio,et al. Deep Sparse Rectifier Neural Networks , 2011, AISTATS.
[74] Yoshua Bengio,et al. On the Expressive Power of Deep Architectures , 2011, ALT.
[75] Dimitrios Gunopulos,et al. Embedding-based subsequence matching in time-series databases , 2011, TODS.
[76] Christopher Ariza,et al. Feature Extraction and Machine Learning on Symbolic Music using the music21 Toolkit , 2011, ISMIR.
[77] Simon Dixon,et al. A Corpus-based Study of Rhythm Patterns , 2012, ISMIR.
[78] Nitish Srivastava,et al. Improving neural networks by preventing co-adaptation of feature detectors , 2012, ArXiv.
[79] Yoshua Bengio,et al. Random Search for Hyper-Parameter Optimization , 2012, J. Mach. Learn. Res..
[80] Jasper Snoek,et al. Practical Bayesian Optimization of Machine Learning Algorithms , 2012, NIPS.
[81] Andreas Rauber,et al. Facilitating Comprehensive Benchmarking Experiments on the Million Song Dataset , 2012, ISMIR.
[82] Herbert Jaeger,et al. Long Short-Term Memory in Echo State Networks: Details of a Simulation Study , 2012 .
[83] Yoshua Bengio,et al. Practical Recommendations for Gradient-Based Training of Deep Architectures , 2012, Neural Networks: Tricks of the Trade.
[84] Marián Boguñá,et al. Measuring the Evolution of Contemporary Western Popular Music , 2012, Scientific Reports.
[85] Eamonn J. Keogh,et al. Searching and Mining Trillions of Time Series Subsequences under Dynamic Time Warping , 2012, KDD.
[86] Razvan Pascanu,et al. Theano: new features and speed improvements , 2012, ArXiv.
[87] Gerald Penn,et al. Applying Convolutional Neural Networks concepts to hybrid NN-HMM model for speech recognition , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[88] Juan Pablo Bello,et al. Rethinking Automatic Chord Recognition with Convolutional Neural Networks , 2012, 2012 11th International Conference on Machine Learning and Applications.
[89] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[90] Meinard Müller,et al. Towards Cross-Version Harmonic Analysis of Music , 2012, IEEE Transactions on Multimedia.
[91] Thierry Bertin-Mahieux,et al. Large-Scale Cover Song Recognition Using the 2D Fourier Transform Magnitude , 2012, ISMIR.
[92] Pascal Vincent,et al. Representation Learning: A Review and New Perspectives , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[93] Gerhard Widmer,et al. Automatic Alignment of Music Performances with Structural Differences , 2013, ISMIR.
[94] Andrew L. Maas. Rectifier Nonlinearities Improve Neural Network Acoustic Models , 2013 .
[95] S. Dixon,et al. MIREX 2013: Vamp Plugins from the Centre for Digital Music , 2013 .
[96] Tara N. Sainath,et al. Deep convolutional neural networks for LVCSR , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[97] Razvan Pascanu,et al. On the difficulty of training recurrent neural networks , 2012, ICML.
[98] Jeffrey Dean,et al. Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.
[99] Alex Graves,et al. Generating Sequences With Recurrent Neural Networks , 2013, ArXiv.
[100] Geoffrey E. Hinton,et al. On the importance of initialization and momentum in deep learning , 2013, ICML.
[101] Sida I. Wang,et al. Dropout Training as Adaptive Regularization , 2013, NIPS.
[102] Yoshua Bengio,et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation , 2014, EMNLP.
[103] Georg Heigold,et al. Word embeddings for speech recognition , 2014, INTERSPEECH.
[104] Rob Fergus,et al. Visualizing and Understanding Convolutional Networks , 2013, ECCV.
[105] Jürgen Schmidhuber,et al. Multimodal Similarity-Preserving Hashing , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[106] Yoon Kim,et al. Convolutional Neural Networks for Sentence Classification , 2014, EMNLP.
[107] Thomas Grill,et al. Boundary Detection in Music Structure Analysis using Convolutional Neural Networks , 2014, ISMIR.
[108] Florian Krebs,et al. A Multi-model Approach to Beat Tracking Considering Heterogeneous Music Styles , 2014, ISMIR.
[109] Simon Dixon,et al. Sequential Complexity as a Descriptor for Musical Similarity , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[110] Daniel P. W. Ellis,et al. MIR_EVAL: A Transparent Implementation of Common MIR Metrics , 2014, ISMIR.
[111] Tom Schaul,et al. Unit Tests for Stochastic Optimization , 2013, ICLR.
[112] Surya Ganguli,et al. Exact solutions to the nonlinear dynamics of learning in deep linear neural networks , 2013, ICLR.
[113] Mark D. Plumbley,et al. Score-Informed Source Separation for Musical Audio Recordings: An overview , 2014, IEEE Signal Processing Magazine.
[114] Alex Graves,et al. Neural Turing Machines , 2014, ArXiv.
[115] Quoc V. Le,et al. Sequence to Sequence Learning with Neural Networks , 2014, NIPS.
[116] Sebastian Böck,et al. Improved musical onset detection with Convolutional Neural Networks , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[117] Harm de Vries,et al. RMSProp and equilibrated adaptive learning rates for non-convex optimization. , 2015 .
[118] Sergey Ioffe,et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.
[119] Yoshua Bengio,et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention , 2015, ICML.
[120] Jason Weston,et al. End-To-End Memory Networks , 2015, NIPS.
[121] James Philbin,et al. FaceNet: A unified embedding for face recognition and clustering , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[122] Yoshua Bengio,et al. Describing Multimedia Content Using Attention-Based Encoder-Decoder Networks , 2015, IEEE Transactions on Multimedia.
[123] Chu-Song Chen,et al. Supervised Learning of Semantics-Preserving Hashing via Deep Neural Networks for Large-Scale Image Search , 2015, ArXiv.
[124] Colin Raffel,et al. librosa: v0.4.0 , 2015 .
[125] Hendrik Schreiber,et al. Improving Genre Annotations for the Million Song Dataset , 2015, ISMIR.
[126] Marc'Aurelio Ranzato,et al. Learning Longer Memory in Recurrent Neural Networks , 2014, ICLR.
[127] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[128] Alexander Mordvintsev,et al. Inceptionism: Going Deeper into Neural Networks , 2015 .
[129] Juan Pablo Bello,et al. A Software Framework for Musical Data Augmentation , 2015, ISMIR.
[130] Daniel P. W. Ellis,et al. Large-Scale Content-Based Matching of MIDI and Audio Files , 2015, ISMIR.
[131] Simon Dixon,et al. An End-to-End Neural Network for Polyphonic Music Transcription , 2015, ArXiv.
[132] Geoffrey E. Hinton,et al. A Simple Way to Initialize Recurrent Networks of Rectified Linear Units , 2015, ArXiv.
[133] Jian Sun,et al. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[134] Quoc V. Le,et al. Listen, Attend and Spell , 2015, ArXiv.
[135] Thomas Grill,et al. Exploring Data Augmentation for Improved Singing Voice Detection with Neural Networks , 2015, ISMIR.
[136] Dumitru Erhan,et al. Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[137] Colin Raffel,et al. librosa: Audio and Music Signal Analysis in Python , 2015, SciPy.
[138] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[139] Xiang Zhang,et al. Text Understanding from Scratch , 2015, ArXiv.
[140] Yoshua Bengio,et al. Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.
[141] Daniel P. W. Ellis,et al. Optimizing DTW-based audio-to-MIDI alignment and matching , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[142] Zhuo Chen,et al. Deep clustering: Discriminative embeddings for segmentation and separation , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[143] Karen Livescu,et al. Deep convolutional acoustic word embeddings using word-pair side information , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[144] Daniel P. W. Ellis,et al. Pruning subsequence search with attention-based embedding , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[145] Colin Raffel. Accelerating Multimodal Sequence Retrieval with Convolutional Networks , 2016 .
[146] Sebastian Ruder,et al. An overview of gradient descent optimization algorithms , 2016, Vestnik komp'iuternykh i informatsionnykh tekhnologii.
[147] Yoshua Bengio,et al. End-to-end attention-based large vocabulary speech recognition , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[148] Jiri Matas,et al. All you need is a good init , 2015, ICLR.
[149] Daniel P. W. Ellis,et al. Extracting Ground-Truth Information from MIDI Files: A MIDIfesto , 2016, ISMIR.
[150] Francesco Visin,et al. A guide to convolution arithmetic for deep learning , 2016, ArXiv.
[151] Charu C. Aggarwal,et al. Neural Networks and Deep Learning , 2018, Springer International Publishing.