Music Listening in the Future: Augmented Music-Understanding Interfaces and Crowd Music Listening

In the future, music listening can become more active, more immersive, richer, and deeper through automatic music-understanding technologies (semantic audio analysis). The first half of this invited talk introduces four Augmented Music-Understanding Interfaces that facilitate deeper understanding of music. In these interfaces, visualization of music content and music touch-up (customization) play important roles in augmenting people's understanding, because understanding is deepened through seeing and editing. The second half discusses a new style of music listening called Crowd Music Listening: by posting, sharing, and watching time-synchronous comments (semantic information), listeners can enjoy music together with the crowd. Such Internet-based music listening with shared semantic information also aids music understanding, because understanding is deepened through communication. Finally, two systems that deal with new trends in music listening, time-synchronous comments and mashup music videos, are introduced.
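The abstract describes time-synchronous comments only at a conceptual level. As a rough illustration of the mechanism, the sketch below shows one plausible way a player could store comments anchored to playback positions and surface them as the track plays. The `TimedComment` and `CommentStream` names are hypothetical, introduced here for illustration; they are not from the talk or from any of the systems it presents.

```python
import bisect
from dataclasses import dataclass, field

@dataclass(order=True)
class TimedComment:
    """A comment anchored to a playback position in a track."""
    time_sec: float                                   # position where the comment appears
    text: str = field(compare=False)                  # comment body, ignored when sorting
    user: str = field(compare=False, default="anon")  # poster, ignored when sorting

class CommentStream:
    """Stores time-synchronous comments and yields those due during playback."""

    def __init__(self) -> None:
        self._comments: list[TimedComment] = []

    def post(self, time_sec: float, text: str, user: str = "anon") -> None:
        # Keep the list sorted by timestamp so range lookups stay fast.
        bisect.insort(self._comments, TimedComment(time_sec, text, user))

    def due(self, start_sec: float, end_sec: float) -> list[TimedComment]:
        # Return all comments anchored inside the half-open window [start_sec, end_sec).
        lo = bisect.bisect_left(self._comments, TimedComment(start_sec, ""))
        hi = bisect.bisect_left(self._comments, TimedComment(end_sec, ""))
        return self._comments[lo:hi]

# Usage: poll once per rendered frame while the track plays.
stream = CommentStream()
stream.post(12.5, "here comes the chorus!")
stream.post(12.9, "love this drum fill", user="listener42")
for c in stream.due(12.0, 13.0):
    print(f"[{c.time_sec:5.1f}s] {c.user}: {c.text}")
```

Sorted insertion keeps each lookup logarithmic in the number of comments, which matters when a popular track accumulates thousands of comments, as happens on services like Nico Nico Douga.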
