An Analysis/Synthesis Framework for Automatic F0 Annotation of Multitrack Datasets

Paper presented at: ISMIR 2017, held in Suzhou, China, 23-27 October 2017.
