Tools and experiments in multimodal interaction

Author: Tommi Ilmonen
Title: Tools and Experiments in Multimodal Interaction

The goal of this study is to explore different strategies for multimodal human-computer interaction. Where traditional human-computer interaction relies on a few common user interface metaphors and devices, multimodal interaction seeks new application areas with novel interaction devices and metaphors. Exploring these new areas involves creating new application concepts and implementing them. In some cases the interaction mimics human-human interaction, while in other cases the interaction model is only loosely tied to the physical world.

In the virtual orchestra concept a conductor can conduct a band of virtual musicians. Both the motion and the sound of the musicians are synthesized with a computer. A critical task in this interaction is the analysis of the conductor's motion and the control of the sound synthesis. A system that performs these tasks is presented. The system is also capable of extracting emotional content from the conductor's motion. While the conductor follower system was originally developed with a commercial motion tracker, an alternative low-cost motion tracking system was also built. The new system uses accelerometers with application-specific signal processing for motion capture.

One of the basic tasks of the conductor follower and other gesture-based interaction systems is to refine raw user input data into information that is easy to use in the application. For this purpose a new approach was developed: FLexible User Input Design (FLUID). This toolkit simplifies the management of novel interaction devices and offers general-purpose data conversion and analysis algorithms. FLUID was used in the virtual reality drawing applications AnimaLand and Helma. New particle system models and a graphics distribution system were also developed for these applications. Traditional particle systems were enhanced by adding moving force fields that interact with each other; the interacting force fields make the animations more lively and credible. Graphics distribution becomes an issue if one wants to render 3D graphics with a cost-effective PC cluster. A graphics distribution method based on network broadcast was created to minimize the amount of data traffic, thus increasing performance.

Many multimodal applications also need a sound synthesis and processing engine. To meet these needs the Mustajuuri toolkit was developed. Mustajuuri is a flexible and efficient sound signal processing framework with support for sound processing in virtual environments.
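To illustrate the idea of refining raw user input into application-level information, the following is a minimal sketch of an accelerometer-based beat detector in the spirit of the FLUID-style processing described above. It is not the thesis's actual algorithm or API; the class name, smoothing constant, and threshold are illustrative assumptions.

```cpp
// Hypothetical sketch: raw accelerometer samples in, beat events out.
struct AccelSample { float x = 0, y = 0, z = 0; float time = 0; };
struct BeatEvent   { float time = 0; float strength = 0; };

class BeatDetector {
public:
  explicit BeatDetector(float smoothing = 0.2f, float threshold = 2.0f)
      : m_alpha(smoothing), m_threshold(threshold) {}

  // Feed one raw sample; returns true and fills 'beat' when the smoothed
  // vertical acceleration turns from rising to falling above the threshold.
  bool process(const AccelSample &s, BeatEvent &beat) {
    m_smoothY += m_alpha * (s.y - m_smoothY);   // simple low-pass filter
    bool risingNow = m_smoothY > m_prevY;
    bool peak = m_wasRising && !risingNow && m_prevY > m_threshold;
    if (peak) {
      beat.time = s.time;
      beat.strength = m_prevY;
    }
    m_wasRising = risingNow;
    m_prevY = m_smoothY;
    return peak;
  }

private:
  float m_alpha;
  float m_threshold;      // in m/s^2, purely illustrative
  float m_smoothY = 0.0f;
  float m_prevY = 0.0f;
  bool  m_wasRising = false;
};

// Usage: feed each incoming sample to process(); when it returns true,
// hand the BeatEvent to whatever drives the sound synthesis.
```

The point of such a layer is that the application only sees high-level events (beats, tempo, gesture strength) rather than sensor-specific raw data, which is what makes swapping a commercial tracker for low-cost accelerometers feasible.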
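The enhanced particle systems can be pictured with the sketch below: force fields are themselves moving objects that attract one another, and the resulting drifting fields then act on the particles. This is only an illustration of the general idea under assumed inverse-square attraction, not the thesis implementation.

```cpp
#include <cmath>
#include <vector>

struct Vec3 {
  float x = 0, y = 0, z = 0;
  Vec3 operator+(const Vec3 &o) const { return {x + o.x, y + o.y, z + o.z}; }
  Vec3 operator-(const Vec3 &o) const { return {x - o.x, y - o.y, z - o.z}; }
  Vec3 operator*(float s) const { return {x * s, y * s, z * s}; }
  float length() const { return std::sqrt(x * x + y * y + z * z); }
};

struct Particle   { Vec3 pos, vel; };
struct ForceField { Vec3 pos, vel; float strength = 1.0f; };

// Softened inverse-square attraction toward a field center.
static Vec3 attraction(const Vec3 &from, const ForceField &f) {
  Vec3 d = f.pos - from;
  float r = d.length() + 0.1f;              // avoid the singularity at r = 0
  return d * (f.strength / (r * r * r));
}

void step(std::vector<Particle> &particles,
          std::vector<ForceField> &fields, float dt) {
  // 1. Fields act on each other, so the fields themselves drift and swirl.
  for (std::size_t i = 0; i < fields.size(); ++i)
    for (std::size_t j = 0; j < fields.size(); ++j)
      if (i != j)
        fields[i].vel = fields[i].vel + attraction(fields[i].pos, fields[j]) * dt;
  for (auto &f : fields) f.pos = f.pos + f.vel * dt;

  // 2. The moving fields then accelerate the particles.
  for (auto &p : particles) {
    for (const auto &f : fields) p.vel = p.vel + attraction(p.pos, f) * dt;
    p.pos = p.pos + p.vel * dt;
  }
}
```

Because the fields move and react to each other, the particle motion keeps changing character over time instead of settling into a static flow, which is the "lively and credible" effect referred to above.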
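The broadcast-based graphics distribution can likewise be summarized with a small sketch: instead of sending one copy of each scene update per render node, a single UDP broadcast packet reaches every node on the cluster subnet. The packet format, address, and port below are assumptions for illustration only; the sketch uses plain POSIX sockets, not the thesis's distribution system.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <vector>

// Send one scene-update packet to the whole render cluster at once.
bool broadcastSceneUpdate(const std::vector<char> &packet,
                          const char *broadcastAddr = "192.168.1.255",
                          int port = 45454) {
  int sock = socket(AF_INET, SOCK_DGRAM, 0);
  if (sock < 0) return false;

  int enable = 1;  // permit sending to the broadcast address
  setsockopt(sock, SOL_SOCKET, SO_BROADCAST, &enable, sizeof(enable));

  sockaddr_in dest{};
  dest.sin_family = AF_INET;
  dest.sin_port = htons(port);
  inet_pton(AF_INET, broadcastAddr, &dest.sin_addr);

  // One sendto() call puts one copy of the update on the wire regardless of
  // how many display nodes listen, so traffic does not grow with cluster size.
  ssize_t sent = sendto(sock, packet.data(), packet.size(), 0,
                        reinterpret_cast<sockaddr *>(&dest), sizeof(dest));
  close(sock);
  return sent == static_cast<ssize_t>(packet.size());
}
```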
