AI Song Contest: Human-AI Co-Creation in Songwriting

Machine learning is challenging the way we make music. Although research in deep generative models has dramatically improved the capability and fluency of music models, recent work has shown that it can be challenging for humans to partner with this new class of algorithms. In this paper, we present findings on what 13 musician/developer teams, a total of 61 users, needed when co-creating a song with AI, the challenges they faced, and how they leveraged and repurposed existing characteristics of AI to overcome some of these challenges. Many teams adopted modular approaches, such as independently running multiple smaller models that align with the musical building blocks of a song, before re-combining their results. As ML models are not easily steerable, teams also generated massive numbers of samples and then curated them post-hoc, used a range of strategies to direct the generation, or algorithmically ranked the samples. Ultimately, teams not only had to manage the "flare and focus" aspects of the creative process, but also to juggle them with a parallel process of exploring and curating multiple ML models and outputs. These findings reflect a need to design machine learning-powered music interfaces that are more decomposable, steerable, interpretable, and adaptive, which in turn will enable artists to more effectively explore how AI can extend their personal expression.
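
The generate-then-curate workflow described above can be made concrete with a short sketch. The snippet below is a minimal illustration of the pattern, not any team's actual pipeline: `sample_fn` stands in for an arbitrary generative music model and `score_fn` for a hand-written ranking heuristic, both hypothetical names introduced here for illustration.

```python
import heapq
import random
from typing import Callable, List, Sequence

Notes = Sequence[int]  # one candidate phrase, as MIDI pitch numbers

def generate_and_rank(
    sample_fn: Callable[[], Notes],      # hypothetical: draws one candidate from a model
    score_fn: Callable[[Notes], float],  # hypothetical: heuristic used to rank candidates post-hoc
    n_samples: int = 1000,
    top_k: int = 10,
) -> List[Notes]:
    """Draw many samples from a hard-to-steer model, then keep only the best few."""
    candidates = (sample_fn() for _ in range(n_samples))
    return heapq.nlargest(top_k, candidates, key=score_fn)

# Toy stand-ins: random 8-note phrases, ranked by how many pitches fall in C major.
C_MAJOR_PITCH_CLASSES = {0, 2, 4, 5, 7, 9, 11}

def random_phrase() -> Notes:
    return [random.randrange(48, 84) for _ in range(8)]

def in_key_score(notes: Notes) -> float:
    return sum(n % 12 in C_MAJOR_PITCH_CLASSES for n in notes)

shortlist = generate_and_rank(random_phrase, in_key_score, n_samples=500, top_k=5)
```

In practice the algorithmic ranking only narrows a massive sample pool down to a human-auditable shortlist; teams still curated the surviving candidates by ear.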
