Evolutionary Computation for Music Evolution
In this article we describe an application of Evolutionary Computation to Algorithmic Composition. The individuals of the population were defined as groups of four voices (soprano, contralto, tenor and bass), or “chords”, and they are potential solutions for a selection process. Each choir was evaluated under three criteria: melody, harmony and octave. Based on the ordering of consonance of musical intervals, we use the notion of approximating a sequence of notes to its harmonically compatible note, or tonal center. Tonal centers can be thought of as an approximation of the melody that describes its flow. This method uses fuzzy formalism and is posed as an optimization problem based on the physiological factors relevant to hearing music. This approach is significant because it does not adopt any heuristics. The resulting program is called Vox Populi and it is used to generate sound output in real time.

Introduction

The composition of the melody, harmony and octave criteria defines the fitness of a group with respect to the selection function applied. This function returns the “best individual”, or “best chord”, according to the aspects measured. A sequence of best chords is called here a “choir”. The selected group is treated as a set of MIDI notes and played. The resulting system, VOX POPULI, allows the user to modify the fitness function through three controls: the first is melodic; the second is related to the duration of the genetic cycles and the musical rhythm; and the third sets the octaves to be considered.

Definition of Population as MIDI data

Some concepts are fundamental to the understanding of the article. An auditory event can be characterized, for our purposes, by four parameters: pitch, duration, timbre and loudness. Pitch can be defined as the auditory property of a note that is conditioned by its frequency relative to the other notes. The range of musical pitch has been defined as the range within which the interval of an octave can be perceived.
This has been found to correspond roughly to the range of the piano. From this continuum of frequencies, a set of discrete frequencies is selected in such a way that the frequencies bear a definite interval relationship to one another. So, pitch in the musical sense corresponds to a frequency that is selected from a predefined repertoire. In this scheme, twelve discrete frequencies are chosen in the interval of an octave such that the ratio between any two adjacent frequencies is $2^{1/12}$. This interval ratio is termed a semitone in music terminology. Timbre is the individuality of sound acquired by the addition of harmonics to the fundamental pitch. It is characteristic of a given musical instrument and of the mode of playing it. Loudness is the aspect of an auditory event related to its intensity, and duration is the period of time for which the event is perceivable.

With the above notion of an auditory event, melody is defined as a fixed temporal ordering of auditory events. So, a melody represented in conventional Western notation resembles a system of Cartesian coordinates. The pitch and duration are carefully marked; timbre is decided by the instrument for which the melody is written, and loudness is marked more crudely [Vidyamurthy 1992].

The Rhythmic Genetic Cycle

The general architecture of the rhythmic genetic cycle is shown in Fig. 1. The individuals of the population are defined as groups of pitches assigned to four voices. Initially, the voices’ pitch material is randomly generated in the interval [0..127], which corresponds to the possible values of the note attribute in the MIDI table. In each era, 30 choirs are generated. The choir is internally represented as a chromosome of 28 bits, or 4 words of 7 bits, one for each voice.

Fig. 1 – The rhythmic genetic cycle

As the figure shows, there are two cooperative processes in the genetic cycle, one producing notes and the other consuming notes.
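The 28-bit chromosome layout described above (four 7-bit words, one per voice) can be illustrated with a small packing routine. This is a minimal sketch under stated assumptions: the function names `encode_choir` and `decode_choir` are hypothetical, and the order of the voices inside the chromosome (first voice in the highest bits) is an assumption, since the text does not specify it.

```python
VOICE_BITS = 7                       # one MIDI note number, 0..127
NUM_VOICES = 4                       # four voices per choir
MASK = (1 << VOICE_BITS) - 1         # 0b1111111


def encode_choir(notes):
    """Pack four MIDI note numbers into one 28-bit integer.

    Assumption: the first note occupies the highest 7 bits.
    """
    if len(notes) != NUM_VOICES:
        raise ValueError("a choir has exactly four voices")
    chromosome = 0
    for note in notes:
        if not 0 <= note <= MASK:
            raise ValueError("MIDI note numbers lie in 0..127")
        chromosome = (chromosome << VOICE_BITS) | note
    return chromosome


def decode_choir(chromosome):
    """Unpack a 28-bit integer into four MIDI note numbers."""
    notes = []
    for _ in range(NUM_VOICES):
        notes.append(chromosome & MASK)   # lowest 7 bits = last voice
        chromosome >>= VOICE_BITS
    return list(reversed(notes))
```

Packing the voices into a single integer keeps bit-level crossover and mutation simple, because both operators can then work on one 28-bit string.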
Once the initial population of individuals has been created, the fitness of each choir is evaluated. The fitness function is defined as a composition of three sub-functions: the harmonic fitness, the melodic fitness and the octave fitness. For the evaluation of the harmonic fitness and the melodic fitness, the consonance criterion is considered. After the fitness evaluation, typical operations of genetic programming, such as crossover and mutation, are applied to the individuals according to given probability rates. Once the best chord is selected, it is made available to be played. The second process, which looks for newly available notes, plays them. The following steps are carried out in the genetic cycle:

Step 1: Create an initial population randomly;
Step 2: Until the termination criterion has been satisfied, perform the following:
• Evaluate the fitness of each individual in the population;
• Apply the genetic operators to individual chromosomes, or groups of voices, chosen with a probability based on fitness, to create a new population. That is:
  Reproduction: Copy existing individual strings to the new population;
  Crossover: Create two new chromosomes by crossing over randomly chosen sub-lists (sub-strings) from two existing chromosomes;
  Mutation: Create new chromosomes from an existing one by randomly mutating the character in the list.
Step 3: Designate the best individual that appeared in any generation as the result [2].

The steps above were detailed in order to make visible the many operations carried out in each cycle. Although designating the best individual takes a similar average time in each generation, the small variations in the duration of each cycle determine the genetic rhythm.
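The steps of the genetic cycle can be sketched as a short loop over 28-bit chromosomes. This is a minimal sketch, not the Vox Populi implementation: `placeholder_fitness` is a hypothetical stand-in for the harmonic, melodic and octave fitness, the crossover and mutation rates are assumed values, and fitness-proportional selection with single-point crossover is one common choice that the text does not confirm. The population size of 30 follows the text.

```python
import random

CHOIR_BITS = 28          # 4 voices x 7 bits, as in the text
POPULATION_SIZE = 30     # choirs generated in each era, as in the text
ALL_ONES = (1 << CHOIR_BITS) - 1


def placeholder_fitness(chromosome):
    # Hypothetical stand-in (counts set bits); NOT the article's fitness.
    return bin(chromosome).count("1")


def crossover(a, b, rng):
    """Single-point crossover: swap the low bits below a random cut point."""
    point = rng.randint(1, CHOIR_BITS - 1)
    low = (1 << point) - 1
    high = ALL_ONES ^ low
    return (a & high) | (b & low), (b & high) | (a & low)


def mutate(chromosome, rng, rate):
    """Flip each bit independently with probability `rate`."""
    for bit in range(CHOIR_BITS):
        if rng.random() < rate:
            chromosome ^= 1 << bit
    return chromosome


def select(population, scores, rng):
    # Fitness-proportional choice; +1 keeps every weight positive.
    return rng.choices(population, weights=[s + 1 for s in scores], k=1)[0]


def genetic_cycle(generations, fitness=placeholder_fitness,
                  crossover_rate=0.7, mutation_rate=0.01, seed=None):
    """Run the cycle and return the best chromosome of each generation."""
    rng = random.Random(seed)
    # Step 1: random initial population.
    population = [rng.getrandbits(CHOIR_BITS) for _ in range(POPULATION_SIZE)]
    best_per_generation = []
    # Step 2: here the termination criterion is a fixed generation count.
    for _ in range(generations):
        scores = [fitness(c) for c in population]
        best_per_generation.append(max(population, key=fitness))
        new_population = []
        while len(new_population) < POPULATION_SIZE:
            a = select(population, scores, rng)
            b = select(population, scores, rng)
            if rng.random() < crossover_rate:
                a, b = crossover(a, b, rng)   # otherwise plain reproduction
            new_population.append(mutate(a, rng, mutation_rate))
            new_population.append(mutate(b, rng, mutation_rate))
        population = new_population[:POPULATION_SIZE]
    return best_per_generation
```

Returning the best chromosome of every generation, rather than a single overall winner, mirrors the idea that each cycle hands one selected group to the process that plays the notes.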
The time interval between the selection of the best choirs varies from one pair of successive cycles to the next. The consuming process, on the other hand, is regularly “asking for new notes”. When the best chord is selected, and consequently becomes available, the notes that constitute it are played until the next chord is selected. Once the next chord is selected, the new notes are played. The different durations of the notes being played define the rhythm and the melody of the genetic cycle.

The Voices Encoding

The voices are associated with bass (B), tenor (T), contralto (C), soprano (S) and NH (no human); in fuzzy theory these are called linguistic values. The related fuzzy sets are shown in Fig. 2. Each voice is assigned a value in {NH, B, T, C, S}. To classify a voice, the membership function is evaluated for each set and the maximum value is taken. In case of a tie, the distance to the center of the fuzzy set is considered. Voices may be generated anywhere in the whole interval, but voices with linguistic value NH are discarded in the selection process. The interval of pitches considered reachable by human voices is H = [39..84].

Fig. 2 – The linguistic values associated with the voices

The Octave Criterion

Once the voices of each choir are evaluated according to their distribution in the interval of voices, the octave criterion returns a value in the set {NH, W, M, G, E}. These linguistic values are associated with the concepts No Human, Weak, Medium, Good and Excellent. The optimal case, Excellent, occurs when the choir contains the voices Bass, Tenor, Contralto and Soprano. In this case, Nvalues = 4.
A choir containing none of these voices returns NH; one returns W; two returns M; and three returns G; with Nvalues = 0, 1, 2 and 3, respectively. Therefore, the octave fitness is evaluated as:
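The steps that lead up to the octave fitness, namely classifying each voice and mapping the number of distinct human voices (Nvalues) to a linguistic value, can be sketched as follows. This is a minimal sketch and does not give the numeric octave fitness itself. The triangular membership functions, their centers and their widths in `VOICE_SETS` are hypothetical stand-ins for the fuzzy sets of Fig. 2; only the interval H = [39..84], the labels and the Nvalues mapping come from the text.

```python
HUMAN_RANGE = (39, 84)   # interval H from the text

# Hypothetical triangular fuzzy sets: voice -> (center, half_width).
# These numbers are assumptions, not values from the article.
VOICE_SETS = {"B": (45, 12), "T": (55, 12), "C": (65, 12), "S": (75, 12)}

# Index = Nvalues, the number of distinct human voices in the choir.
OCTAVE_LABELS = ["NH", "W", "M", "G", "E"]


def membership(note, center, half_width):
    """Triangular membership: 1 at the center, 0 beyond the half width."""
    return max(0.0, 1.0 - abs(note - center) / half_width)


def classify_voice(note):
    """Return the linguistic value of a MIDI note: NH, B, T, C or S."""
    low, high = HUMAN_RANGE
    if not low <= note <= high:
        return "NH"

    def key(name):
        center, half_width = VOICE_SETS[name]
        # Highest membership wins; ties go to the nearest set center.
        return (membership(note, center, half_width), -abs(note - center))

    return max(VOICE_SETS, key=key)


def octave_label(notes):
    """Map the count of distinct human voices to NH, W, M, G or E."""
    voices = {classify_voice(n) for n in notes} - {"NH"}
    return OCTAVE_LABELS[len(voices)]
```

For instance, `octave_label([45, 55, 65, 75])` yields `"E"` under these assumed sets, because the four notes fall into four different voices.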
[1] Jaishankar Chakrapani, et al. Cognition of Tonal Centers: A Fuzzy Approach, 1992.
[2] W. Pedrycz, et al. An Introduction to Fuzzy Sets: Analysis and Design, 1998.