Tapestrea: techniques and paradigms for expressive synthesis, transformation, and re-composition of environmental audio

TAPESTREA is a sound design and composition framework that facilitates the creation of new sound from existing digital audio recordings, through interactive analysis, transformation and re-synthesis. During analysis, sinusoidal modeling and transient detection techniques are used to parametrically extract desired sound templates of different types: sinusoidal events, transient events, and stochastic background. Each extracted template is transformed and synthesized independently using an appropriate technique, such as sinusoidal re-synthesis or wavelet tree learning. This allows specialized transformations on each template based on its type; sinusoidal templates undergo real-time, large-scale time and frequency transformations, while background is generated parametrically from extracted samples. The user interacts with TAPESTREA via a set of graphical interfaces. Synthesis is further controlled through ChucK scripts, which allow simultaneous, precise manipulation of many parameters. They also allow control via external input devices and user-defined GUI elements. These combined techniques form a workbench for completely transforming a sound scene, dynamically generating soundscapes, or creating musical tapestries by weaving together transformed elements from different recordings. Thus, TAPESTREA introduces a new paradigm for composition, sound design, and other sonic sculpting tasks. Work on further improving the system includes user studies to compare alternative algorithms for generating stochastic background noise.
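As a rough illustration of why a sinusoidal template lends itself to independent time and frequency transformations, the sketch below resynthesizes a single sinusoidal track from breakpoints. It is not TAPESTREA's implementation: the function name `resynthesize`, the `(time, frequency, amplitude)` breakpoint format, and the parameters `time_stretch` and `freq_scale` are all hypothetical, and the track is assumed to start at time 0 and contain at least two breakpoints.

```python
import math


def resynthesize(track, sample_rate=8000, time_stretch=1.0, freq_scale=1.0):
    """Additive resynthesis of one sinusoidal track (illustrative only).

    track: list of (time_sec, freq_hz, amplitude) breakpoints sorted by time,
           starting at time 0 (hypothetical format, not TAPESTREA's own).
    """
    # Stretch the breakpoint times and scale the frequencies independently.
    points = [(t * time_stretch, f * freq_scale, a) for t, f, a in track]
    n_samples = int(round(points[-1][0] * sample_rate))

    out = []
    phase = 0.0
    seg = 0
    for n in range(n_samples):
        t = n / sample_rate
        # Advance to the breakpoint segment that contains time t.
        while seg < len(points) - 2 and t >= points[seg + 1][0]:
            seg += 1
        (t0, f0, a0), (t1, f1, a1) = points[seg], points[seg + 1]
        x = (t - t0) / (t1 - t0) if t1 > t0 else 0.0
        x = min(max(x, 0.0), 1.0)
        # Linearly interpolate frequency and amplitude between breakpoints.
        freq = f0 + x * (f1 - f0)
        amp = a0 + x * (a1 - a0)
        out.append(amp * math.sin(phase))
        # Accumulate phase so frequency changes do not cause discontinuities.
        phase += 2.0 * math.pi * freq / sample_rate
    return out


# A 0.5 s, 440 Hz event with a short attack and a long decay.
track = [(0.0, 440.0, 0.0), (0.1, 440.0, 1.0), (0.5, 440.0, 0.0)]
original = resynthesize(track)
# Twice as long and one octave lower, with the envelope shape preserved.
transformed = resynthesize(track, time_stretch=2.0, freq_scale=0.5)
```

Because the event is stored as parameters rather than samples, duration and pitch can be changed separately, which a plain sample-rate change cannot do.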
