Speech, Sound and Music Processing: Embracing Research in India

Important aspects of singing ability include musical accuracy and voice quality. In Indian classical music, musical accuracy depends not only on the correct sequence of notes but also on the nature of the pitch transitions between notes. These transitions are closely related to gamakas (ornaments), which are important to the aesthetics of the genre. Thus, a higher level of singing skill involves achieving the necessary expressiveness through correct rendering of ornamentation, and this ability can serve to distinguish a well-trained singer from an amateur. We explore objective methods to assess the quality of ornamentation rendered by a singer with reference to a model rendition of the same song. Methods are proposed for the perceptually relevant comparison of complex pitch movements, based on cognitively salient features of the pitch contour shape. The objective measurements are validated by their observed correlation with subjective ratings from human experts. Such an objective assessment system can serve as a useful feedback tool in the training of amateur singers.