A two-level method to control granular synthesis

This paper presents a formal system for the algorithmic control of composition based on granular synthesis. The system features two description levels: a low level, which organizes grains into a graph structure, and a high level, which distributes the low-level graphs at specific locations in a space. A composition is a trajectory in the space, appropriately interpreted to control a number of parameters of physical and musical relevance. The paper is organized as follows: first, we introduce the composition process with granular synthesis and briefly outline the current approaches to control; second, we describe the formal system in terms of its two levels; finally, we show how the system can be viewed as a generalization of the note approach and the stochastic approach.

1. THE COMPOSITION WITH GRANULAR SYNTHESIS

Granular synthesis is a general term that encompasses various kinds of synthesis techniques based on a grain representation of sound, i.e. sonic events are built from “elementary sonic elements” of very short duration ([1]; as a general reference, see [2]). Different organization techniques can lead to very different timbral and compositional results (see [1], [2], [3], [4]). One of the main questions arising when working with grains is therefore how to move from the single-grain level (microstructure) up to compositional design (macrostructure), possibly passing through the note level (ministructure) and the rhythm level (mesostructure) (following [5]: 266; see also [2]: 3). We can distinguish two major approaches: the note approach and the stochastic approach.

In the note approach, the focus is on microstructure as embedded in ministructure: ministructure defines the sound objects and granularity defines the timbre of each object (e.g. drum roll, rolled phonemes, flutter-tongue, [6]: 56). Granular synthesis and granulation of existing sound objects are methods to create or transform elements at the “note” level.
As in traditional composition, there is a logical gap between sound and structure ([7]). This is the approach implemented in grain-based modules of DSP applications ([8], [1], [4]) and in Csound built-in opcodes ([9]).

More radically, in the stochastic approach, granularity is intended as a compositional feature. Having to work with a dust-like material, composers involved in granular synthesis have often decided to avoid an “instrumental-music approach” and to promote textural shaping as a general compositional feature, in order to “unite sound and structure” ([7]: 120). Various stochastic methods and strategies have been used to control grain densities, distribution in the frequency spectrum, and waveshape over time (see the “classic” works by Xenakis, Roads, Truax).

In Xenakis ([5]), sound is conceived as an evolving gas structure. The audible field is modelled according to the Fletcher-Munson diagram, which is subdivided into a finite number of cells. Each instant is described through the stochastic activation of certain cells in the diagram (a “screen”), and each screen has a fixed temporal duration. The sound/composition is an aggregate of screens collected in a “book” in “lexicographic” order (like the series of sections in a tomography).

In Truax ([10], [7]), a massive sound texture is obtained via the juxtaposition of multiple grain streams (“voices”, as in polyphony): the parameters of each grain stream are controlled through tendency masks representing variations over time (e.g. timbre selection, frequency range, temporal density, [7], [10]). This approach is well known in the literature as Quasi-Synchronous Granular Synthesis (QSGS, [11], [4], [2]).

In Roads ([11]), grains are scattered probabilistically over regions of the frequency/time plane (“clouds”). The compositional work relies on controlling the global parameters of the cloud (e.g. start time and duration of the cloud, grain duration, density of grains).
In these three cases, compositional strategies are based on the direct control of the creative process, starting from an empty, uniform time/frequency canvas. Not surprisingly, the compositional metaphor in Roads is explicitly related to painting, using different brushes with different (sound) colours ([11]: 143).

The goal of this paper is to provide a new perspective on the composition process with granular synthesis by introducing a formal system based on two description levels. As we see below, the system can be viewed as a generalization of the note and the stochastic approaches.

1 See also the graphic notation in [12]: 156-57. This is the standard spatial metaphor in different granular synthesis implementations using tendency masks: in a Csound-oriented perspective, see for example the software GSC4 ([13]) and Cmask ([14]).

Proceedings of the XIV Colloquium on Musical Informatics (XIV CIM 2003), Firenze, Italy, May 8-9-10, 2003

2. GEOGRAPHY: A TWO-LEVEL SYSTEM FOR GRAIN GENERATION AND CONTROL STRUCTURE

In this section we describe the formal system GeoGraphy, which models the composition process with two components, one for the generation of grain sequences, the other for the parametric control of waveforms. First, we introduce some terminology. A composition is a set of tracks; each track is a grain sequence (Figure 1), where the single grains are waveforms that result from granular synthesis and parametric control. The formal system GeoGraphy consists of two components: a graph-based generator of grain sequences (i.e. tracks), and a map-based controller of grain waveform parameters.

The grain generator (level I) is based on directed graphs, more precisely multigraphs ([15]), since more than one edge is allowed between two vertices (Figure 2). A vertex represents a grain; an edge represents the sequencing relation between two grains.
Grains can be either sampled waveforms with fixed durations, or waveforms generated by a synthesis process with a duration that is explicitly marked on the vertex (all the vertices in Figure 2 represent sampled grains). A label on an edge represents the temporal distance between the onset times of the two grains connected by that edge. A grain sequence is a path on the graph; if the graph contains loops, the path can be infinite.

The generation of a grain sequence is achieved by inserting dynamic elements, called graph actants, into the graph. A graph actant is initially associated with a vertex (which becomes the origin of a path); the actant then navigates the graph by following the directed edges according to some probability distribution. Multiple independent graph actants can navigate a graph structure at the same time, thus producing more than one grain sequence.

2 All durations in the formalism can be made dependent on some probability distribution, so that the system acts as a general stochastic grain generator. This feature, together with the track (or voice) structure of the musical piece, allows GeoGraphy to simulate the expressive power of QSGS.
3 For readers who are familiar with Petri nets, a graph actant can be viewed as a token. The probabilistic control of the token is also reminiscent of stochastic Petri nets.

Figure 2 shows three examples of graphs. The graph in Figure 2a is a multigraph with many connections (almost completely connected). It also contains loops. One possible result is shown in Figure 3, where some amplitude control, a typical Gaussian envelope, has been applied to avoid clicks. Starting from vertex 4, the graph actant generates a grain of duration 43 milliseconds (vertex label), then reaches vertex 1 with a delay of 124 milliseconds (edge label); it loops twice on vertex 1, generating two grains of 51 milliseconds with a delay of 63 milliseconds, then leaves vertex 1 for vertex 2, and so on.
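
The actant mechanism described above can be illustrated with a minimal Python sketch. It is not the GeoGraphy implementation: the names `Edge`, `DURATIONS`, `EDGES` and `walk` are hypothetical, and so is the per-edge `weight` used as the probability distribution. The graph is a toy one that reuses only the values quoted in the text (vertex 4: 43 ms, vertex 1: 51 ms, edge 4→1: 124 ms, loop on vertex 1: 63 ms); every other label is invented for illustration and does not reproduce Figure 2a.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    target: int          # destination vertex
    delay_ms: float      # onset-to-onset distance (edge label)
    weight: float = 1.0  # relative probability of taking this edge (assumed)


# Vertex label: grain duration in milliseconds.
# Values for vertices 4 and 1 come from the text; vertex 2 is hypothetical.
DURATIONS = {4: 43.0, 1: 51.0, 2: 30.0}

# Multigraph as adjacency lists: parallel edges are simply repeated entries.
# Only 4->1 (124 ms) and 1->1 (63 ms) come from the text; the rest is invented.
EDGES = {
    4: [Edge(1, 124.0)],
    1: [Edge(1, 63.0), Edge(2, 80.0)],
    2: [Edge(4, 100.0), Edge(4, 200.0)],  # two parallel edges
}


def walk(start, steps, rng):
    """Let one actant traverse the graph; return (onset_ms, duration_ms, vertex) events."""
    events = []
    vertex, onset = start, 0.0
    for _ in range(steps):
        events.append((onset, DURATIONS[vertex], vertex))
        outgoing = EDGES.get(vertex, [])
        if not outgoing:  # dead end: the path is finite
            break
        weights = [edge.weight for edge in outgoing]
        edge = rng.choices(outgoing, weights=weights, k=1)[0]
        onset += edge.delay_ms
        vertex = edge.target
    return events
```

In this toy graph vertex 4 has a single outgoing edge, so `walk(4, 2, random.Random(0))` always yields the grain at vertex 4 followed, 124 ms later, by the grain at vertex 1. Several actants can be simulated by calling `walk` repeatedly with different start vertices, each call producing one track.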
As grain duration and onset delay are independent, grains can overlap (vertex label > edge label, see the last two grains in Figure 3). Figure 2b shows a graph with one vertex and one edge that loops on that vertex. The grain sequence produced by such a graph is the exact repetition of the grain associated with the vertex; each repetition starts 63 milliseconds after the beginning of the previous one. Figure 2c shows a graph consisting of three disconnected subgraphs, each with one vertex and three edges that loop on the vertex itself. If we assume a single actant on each subgraph, the system generates three simultaneous streams of grains. If we associate a grain of fixed frequency with each vertex, we obtain a spectrum consisting of three rows (Figure 4), a “stratus” in Roads’ typology ([11]: 165, [2]: 104).

In order to control the parameters associated with the grain waveforms, the idea implemented in the GeoGraphy system (level II) is to position the graphs in a space, and then to control the parameter values by navigating the space (control space or map of graphs, Figure 5). Once the single sound streams have been defined through the generation of graphs, the composer distributes the graphs onto a map, and then designs a trajectory that determines how the several sound streams contribute to the piece. The control of the parameters relies on the spatial metaphor: parameter value ranges are mapped onto spatial distance, and the nearer a trajectory is to some vertex, the higher the value of some parameter for the grain waveform represented by that vertex.

Figure 1: A musical piece
Figure 2: (a) A complex multigraph; (b) A graph of one vertex and one edge; (c) A graph consisting of three disconnected subgraphs.
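
The level II control described above, where a parameter grows as the trajectory approaches a vertex, can be sketched as follows. The text does not specify the mapping function, so this Python sketch assumes a linear falloff within a hypothetical `radius`; the vertex positions, the straight-line trajectory and the names `VERTEX_POS`, `gain`, `trajectory` and `gains_at` are all illustrative assumptions, not part of GeoGraphy.

```python
import math

# Hypothetical positions of three graph vertices on a 2-D control space.
VERTEX_POS = {"a": (0.0, 0.0), "b": (10.0, 0.0), "c": (5.0, 8.0)}


def gain(distance, radius=10.0):
    """Assumed linear falloff: 1.0 at the vertex, 0.0 at `radius` or beyond."""
    return max(0.0, 1.0 - distance / radius)


def trajectory(t):
    """Hypothetical trajectory: a straight line from (0, 0) to (10, 0), t in [0, 1]."""
    return (10.0 * t, 0.0)


def gains_at(t, radius=10.0):
    """Parameter value for every vertex when the trajectory is at time t."""
    x, y = trajectory(t)
    return {
        name: gain(math.hypot(x - vx, y - vy), radius)
        for name, (vx, vy) in VERTEX_POS.items()
    }
```

With these assumed values, the grain at vertex `a` is at full value when the trajectory starts and fades out as the trajectory moves towards `b`, while `c` contributes most around the midpoint, where the trajectory passes closest to it. Any monotonically decreasing function of distance could replace the linear falloff.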