ImprovisationBuilder: improvisation as conversation

To participate in musical improvisations, an interacting system must both generate appropriate musical materials and express those materials appropriately in collaboration with other performers. We have used results from the study of conversation and discourse to identify the major components for a theory of improvisation: listening, composing, and realizing. Our goal is a framework based on a model of musical improvisation as conversation that incorporates signal level as well as event level control of sound.

2. BACKGROUND

2.1. Sal-Mar Construction and SAL

The Sal-Mar Construction (Franco 1974, Martirano 1971) was an interdisciplinary project involving Salvatore Martirano, computer science graduate student Sergio Franco, and ILLIAC III designers Rich Borovec and James Divilbiss. It was based on the idea of “zoomable” control: being able to apply the same controls at any level, from the micro-structure of individual timbres to the macro-structure of an entire musical composition. Weighing in at 1500 pounds, the Sal-Mar Construction provided digital control over analog synthesis modules through a unique touch panel consisting of banks of switches assignable to any level of control.

The YahaSALmaMAC orchestra features MIDI synthesizers under control of the Sound and Logic (SAL) program, which was implemented in LeLisp on the Apple Macintosh by Salvatore Martirano and David Tscheng. Sound and Logic participates in performances by transforming gestures played by the human performers into new gestures. Using the Macintosh keyboard, the human performer can cause the computer to perform looping or change orchestration, intervening in otherwise automated processes (as with the Sal-Mar Construction).

A major impetus for the current project was the desire to include timbral control in an improvisation system, combining the best features of SAL and the Sal-Mar Construction.

2.2. Kyma System

Kyma is a sound specification language that does not distinguish between signal level and event level processing or between the concepts of orchestra and score; these models are supplanted by arbitrary hierarchical structures constructed by the composer out of uniform Sound objects (Scaletti 1987, 1991). Kyma Sounds are generated in real time by the Capybara, a digital signal multiprocessor. A Macintosh driver for controlling the Capybara enables any program to play a Kyma Sound and to control its parameters in real time.

2.3. Improvisation as Conversation

While computers that converse are still years away, systems that interact in musical performances exist today (see, for example, Rowe 1991). Many have used linguistic techniques in the representation of musical knowledge; the logical next step is to consider musical interaction as being analogous to language interaction, or conversation.

Hartmann describes one manifestation of this analogy, “trading fours” in jazz performance (Hartmann 1991): “A four measure phrase leaves the player enough room (say, three to six seconds) to develop one idea, to make one statement; yet there is no mistaking the dialogue within which each statement takes its place, and often the musicians answer each other directly. The resemblance to conversation is uncanny.”

Conversation and musical improvisation are at once similar and different. For example, they seem to differ in their degree of simultaneity; improvisation entails that the participants “talk” at the same time, while most models of conversation assume strictly alternating speakers. However, overlap often occurs in real conversations. Much of the simultaneous activity during improvisation is akin to conversational overlap; a pianist accompanying a soloist is like a listener acknowledging and encouraging a speaker. Conversation seems to lack the strict temporal structure of harmonic rhythm that often supports group jazz improvisation.
However, the timing of conversational overlap is crucial; a conversational participant who never (or always) interrupts other speakers will not fully participate.

In addition to timing issues, realizing speech materials in a conversation requires control over timbre as well as timing. For example, a sarcastic tone can turn the phrase “Yeah, yeah” from an affirmation into a rejection. The improvisor relies on timbral control and variation as much as rhythmic and melodic development.

By analogy with conversation, an improvising system should have four properties. First, it should listen to the musical environment in which it finds itself. Second, it should generate musical material which relates to that environment. Third, it should realize these musical materials with timing that displays awareness of the other performers. Fourth, it should employ timbral control as a coherent part of realizing musical materials.

3. THE IMPROVISATIONBUILDER PROGRAM

3.1. Plan

One constraint in this project is that at every stage of development, ImprovisationBuilder must remain usable by a composer in actual performance and concerts. In order to achieve this, we have devised a four-step plan:

(i) Translate SAL into Smalltalk-80 and add to it some compositional features of the Sal-Mar Construction
(ii) Expand the Smalltalk-80 version to include control of a real-time signal processor
(iii) Generalize and expand upon the specifics of SAL, culminating in a full conversation-based framework for improvisation
(iv) Expand support for other input and output devices

3.2. Progress Report

As of this writing we are largely finished with (i) and making considerable progress on (ii). The framework supports four basic kinds of activity, corresponding to the four requirements for improvising systems set forth above. Listeners process the incoming music, parsing it into phrases and focusing the system’s attention.
Players create new phrases, either by transforming phrases supplied by the listener or via some compositional algorithm. Players could also supply previously composed phrases. Realizers attempt to express these phrases appropriately, both through timely presentation and timbral control.

ImprovisationBuilder is written in ParcPlace Smalltalk-80 on the Apple Macintosh. Smalltalk primitives connect ImprovisationBuilder with input and output devices. Connections to MIDI devices are handled by the Apple MIDI Manager, while connections to the Symbolic Sound Capybara are handled by a driver that controls its operation (see Figure 1). A MusicScheduler dispatches all MIDI and Capybara events, decoupling the Players and Listeners from playback timing.
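
The division of labor among Listeners, Players, Realizers, and the MusicScheduler can be illustrated with a minimal sketch. It is written in Python for brevity and is not the Smalltalk-80 implementation; the class names follow the text, but every method name, the silence threshold `gap`, the transposition `interval`, and the event tuple format are assumptions made for illustration. Timbral control is omitted, and the Realizer shown here uses the simplest possible timing rule: answer after the heard phrase ends.

```python
import heapq
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Note:
    onset: float     # seconds from the start of the performance
    pitch: int       # MIDI note number
    duration: float  # seconds


Phrase = List[Note]


class Listener:
    """Parses incoming notes into phrases, splitting at silences (assumed rule)."""

    def __init__(self, gap: float = 1.0):
        self.gap = gap  # hypothetical silence threshold, in seconds

    def parse(self, notes: List[Note]) -> List[Phrase]:
        phrases, current = [], []
        for note in sorted(notes, key=lambda n: n.onset):
            if current:
                previous_end = current[-1].onset + current[-1].duration
                if note.onset - previous_end > self.gap:
                    phrases.append(current)
                    current = []
            current.append(note)
        if current:
            phrases.append(current)
        return phrases


class Player:
    """Creates a new phrase by transforming a heard one (here, transposition)."""

    def __init__(self, interval: int = 7):
        self.interval = interval  # hypothetical transposition in semitones

    def respond(self, phrase: Phrase) -> Phrase:
        return [Note(n.onset, n.pitch + self.interval, n.duration) for n in phrase]


class MusicScheduler:
    """Orders timed events so that producers need not manage playback timing."""

    def __init__(self):
        self._queue = []
        self._count = 0  # tie-breaker: keeps insertion order for equal times

    def schedule(self, time: float, event: Tuple) -> None:
        heapq.heappush(self._queue, (time, self._count, event))
        self._count += 1

    def dispatch(self) -> List[Tuple[float, Tuple]]:
        dispatched = []
        while self._queue:
            time, _, event = heapq.heappop(self._queue)
            dispatched.append((time, event))
        return dispatched


class Realizer:
    """Places a response in time, starting when the heard phrase ends."""

    def __init__(self, scheduler: MusicScheduler):
        self.scheduler = scheduler

    def realize(self, response: Phrase, heard: Phrase) -> None:
        heard_end = max(n.onset + n.duration for n in heard)
        start = response[0].onset
        for n in response:
            when = heard_end + (n.onset - start)
            self.scheduler.schedule(when, ("note", n.pitch, n.duration))


# Usage: two heard phrases (separated by a 2-second silence) are each answered.
heard = [Note(0.0, 60, 0.5), Note(0.5, 62, 0.5), Note(3.0, 64, 0.5)]
scheduler = MusicScheduler()
listener, player, realizer = Listener(), Player(), Realizer(scheduler)
for phrase in listener.parse(heard):
    realizer.realize(player.respond(phrase), phrase)
events = scheduler.dispatch()
```

In this sketch only the Realizer and the scheduler deal with absolute playback time; the Listener and Player work on phrases alone. That mirrors the decoupling described above, although the actual system may distribute these responsibilities differently.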