Singing information processing

This paper introduces singing information processing, defined as music information processing for singing voices. Because many people listen to music with a focus on the singing, singing is one of the most important elements of music. Singing information processing attracts attention not only from a scientific point of view but also for its commercial applications, such as singing synthesis, automatic singing pitch correction, query-by-humming, and singing skill evaluation for karaoke. The concept of singing information processing is broad and still emerging.
