Transfer Learning with Jukebox for Music Source Separation

In this work, we demonstrate how to adapt a publicly available pre-trained Jukebox model to the problem of separating audio sources from a single mixed audio channel. Our transfer-learning architecture is fast to train, and our results demonstrate performance comparable to state-of-the-art approaches. We provide an open-source implementation of our architecture (rebrand.ly/transfer-jukebox-github).
