klipp av: Live Algorithmic Splicing and Audiovisual Event Capture

Recent new media concerts reflect a trend toward the fuller integration of modalities enabled by close audiovisual collaboration, avoiding the sometimes artificial separation of disc jockey (DJ) and video jockey (VJ), of audio and visual artist. Integrating the audio and visual domains has been an artistic concern from the experimental films of such pioneers as Oskar Fischinger and Norman McLaren earlier in the twentieth century, through 1960s happenings, 1970s analog video synthesizers, and 1980s pop videos, to the current proliferation of VJing, DVD labels, and live cinema (Lew 2004). The rise of the VJ has been allied with the growth of club culture since the 1980s, with the Super-8 film and video projectionists of early raves now replaced by "laptopists" armed with commercial digital VJ software such as Isadora, Aestesis, Motion Dive, and Arkaos VJ. (An extensive list is maintained at www.audiovisualizers.com.)

In much current practice, where a VJ accompanies fixed (pre-recorded) audio, correlation between sound and image is usually achieved through a simple spectral analysis of the overall output sound: graphical objects can be controlled by a downsampled energy envelope in a given frequency band (a minimal sketch of this mapping appears at the end of this introduction). Yet this is a crude solution for live generated audio; fine details in the creation of audio objects should themselves be accessible as video controls. Analogous to the sampling culture within digital music, source material for visual manipulation is often provided by pre-prepared footage or captured live with digital cameras. Synthesis also provides an option for the creation of imagery, and generative graphics are a further staple. Modern programs integrate many different sources and effects processes within software interfaces, with external control from MIDI, Open Sound Control (OSC; Wright and Freed 1997), and Universal Serial Bus (USB) devices.

Live performance has seen the development of MIDI-triggering software for video clips, such as EBN's Video Control System and Coldcut's VJamm (www.vjamm.com), and of turntable-tracking devices applied as control interfaces for video playback, such as Final Scratch (www.finalscratch.com) and MsPinky (www.mspinky.com). The influential audiovisual sampling group Coldcut performs live by running precomposed or keyboard-performed MIDI sequences from Ableton Live as control inputs to their VJamm software, triggering simultaneous playback of video clips with their soundtracks. They have not, however, explored the use of captured audio and video, nor the real potential of algorithmic automation of such effects. This article describes techniques that take this natural step.

Customizable graphical programming languages such as Max/MSP (with the nato.0+55, softVNS2, and Jitter extensions), Pure Data (Pd, with the GEM, PDP, and GridFlow extensions), or jMax (with the DIPS, or Digital Image Processing with Sound, package of Matsuda et al. 2002) cater to those who see no a priori separation of the modalities and wish to generate both concurrently, whether from a common algorithm or through some form of internal message passing. Other authors define their own protocols for information transfer between modality-specific applications (Betts 2002; Collins and Olofsson 2003), perhaps using a network protocol such as OSC to connect laptops; a sketch of such messaging also appears below.

The heritage of the VJ is somewhat independent of another tradition that has combined music and image, namely film. It is important to be cautious about the direct application of film music theory, particularly wherever the subservience of nondiegetic music is trumpeted. The emotional underscoring of narrative concerns in typical orchestral film music is certainly not the state of play in a club.
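The band-envelope mapping described above can be sketched briefly. The following is a minimal illustration, assuming Python with NumPy rather than any of the VJ packages mentioned; the function name, band limits, frame sizes, and control rate are illustrative choices, not those of any particular product.

```python
# Sketch of the common VJ mapping: a downsampled energy envelope in one
# frequency band of the output sound, usable as a video control signal.
import numpy as np

def band_energy_envelope(signal, sr, band=(60.0, 250.0),
                         frame_size=1024, hop=512, control_rate=30.0):
    """Return a normalized control-rate envelope of energy in `band` (Hz)."""
    window = np.hanning(frame_size)
    freqs = np.fft.rfftfreq(frame_size, d=1.0 / sr)
    band_bins = (freqs >= band[0]) & (freqs <= band[1])

    # Short-time band energy, one value per analysis frame.
    energies = []
    for start in range(0, len(signal) - frame_size, hop):
        frame = signal[start:start + frame_size] * window
        spectrum = np.abs(np.fft.rfft(frame)) ** 2
        energies.append(spectrum[band_bins].sum())
    energies = np.asarray(energies)

    # Downsample from the analysis frame rate (sr / hop) to the video control rate.
    frame_rate = sr / hop
    n_control = max(1, int(len(energies) * control_rate / frame_rate))
    idx = np.linspace(0, len(energies) - 1, n_control).astype(int)
    envelope = energies[idx]

    # Normalize so the envelope maps directly onto a graphical parameter (0..1).
    return envelope / (envelope.max() + 1e-12)

# Example: derive a bass-band envelope from a gated 110-Hz test tone.
if __name__ == "__main__":
    sr = 44100
    t = np.arange(sr * 2) / sr
    test = np.sin(2 * np.pi * 110 * t) * (np.sin(2 * np.pi * 2 * t) > 0)
    print(band_energy_envelope(test, sr)[:10])
```

In practice such an envelope would typically be smoothed or thresholded before driving a graphical parameter such as scale or brightness, which is precisely why it gives only coarse, after-the-fact coupling compared with event-level control of live generated audio.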
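The cross-laptop messaging mentioned above can likewise be illustrated with an OSC 1.0 message sent over UDP. This is a minimal sketch using only the Python standard library; the address "/cut/event", its arguments, and the destination host and port are hypothetical and do not reproduce the published protocol of Collins and Olofsson (2003).

```python
# Minimal OSC 1.0 message encoder over UDP, sketching how an audio
# application might announce event data to a visual application on another
# laptop. Address and argument choices below are hypothetical.
import socket
import struct

def _osc_string(s: str) -> bytes:
    # OSC strings are null-terminated and padded to a multiple of 4 bytes.
    data = s.encode("ascii") + b"\x00"
    return data + b"\x00" * ((4 - len(data) % 4) % 4)

def osc_message(address: str, *args) -> bytes:
    # Build address pattern, type-tag string, and big-endian argument data.
    tags, payload = ",", b""
    for a in args:
        if isinstance(a, float):
            tags += "f"
            payload += struct.pack(">f", a)
        elif isinstance(a, int):
            tags += "i"
            payload += struct.pack(">i", a)
        elif isinstance(a, str):
            tags += "s"
            payload += _osc_string(a)
        else:
            raise TypeError(f"unsupported OSC argument type: {type(a)}")
    return _osc_string(address) + _osc_string(tags) + payload

if __name__ == "__main__":
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Hypothetical event report: onset index, beat position, duration in beats.
    msg = osc_message("/cut/event", 3, 1.5, 0.25)
    sock.sendto(msg, ("192.168.0.2", 57120))  # visual laptop's address and port
```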

References

[1] Jehan, T. 2004. "Event-Synchronous Music Analysis/Synthesis."

[2] Posner, M., et al. 1976. "Visual Dominance: An Information-Processing Account of Its Origins and Significance." Psychological Review.

[3] Ando, D., et al. 2002. "DIPS for Linux and Mac OS X." Proceedings of the International Computer Music Conference (ICMC).

[4] Collins, N. 2004. "On Onsets On-the-Fly: Real-Time Event Segmentation and Categorisation as a Compositional Effect."

[5] Cook, N. 1998. Analysing Musical Multimedia.

[6] Rodet, X., et al. 1999. "Automatic Characterisation of Musical Signals: Feature Extraction and Temporal Segmentation."

[7] Klapuri, A. 1999. "Sound Onset Detection by Applying Psychoacoustic Knowledge." Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP).

[8] McCartney, J. 2002. "Rethinking the Computer Music Language: SuperCollider." Computer Music Journal.

[9] Collins, N., and F. Olofsson. 2003. "A Protocol for Audiovisual Cutting." Proceedings of the International Computer Music Conference (ICMC).

[10] Pöppel, E. 1997. "A Hierarchical Model of Temporal Perception."

[11] Baer, T., et al. 1997. "A Model for the Prediction of Thresholds, Loudness, and Partial Loudness."

[12] Collins, N. 2002. "The BBCut Library." Proceedings of the International Computer Music Conference (ICMC).

[13] Lew, M. 2004. "Live Cinema: Designing an Instrument for Cinema Editing as a Live Performance." Proceedings of the International Conference on New Interfaces for Musical Expression (NIME).

[14] Laboratorio Nacional de Música Electroacústica. 2001. Proceedings of the 2001 International Computer Music Conference (ICMC 2001), Havana, Cuba, September 17-22.

[15] Wright, M., and A. Freed. 1997. "Open SoundControl: A New Protocol for Communicating with Sound Synthesizers." Proceedings of the International Computer Music Conference (ICMC).

[16] Dean, R. 2003. Hyperimprovisation: Computer-Interactive Sound Improvisation.

[17] Phillips, N. J. 2000. "Audio-Visual Scene Analysis: Attending to Music in Film."

[18] Collins, N. 2005. "A Comparison of Sound Onset Detection Algorithms with Emphasis on Psychoacoustically Motivated Detection Functions."

[19] Liu, Z., et al. 2000. "Multimedia Content Analysis Using Both Audio and Visual Clues." IEEE Signal Processing Magazine.

[20] Plumbley, M. D., et al. 2004. "Fast Labelling of Notes in Music Signals." Proceedings of the International Conference on Music Information Retrieval (ISMIR).