Contextual awareness, messaging and communication in nomadic audio environments

Nomadic Radio provides an audio-only wearable interface that unifies remote information services such as email, voice mail, hourly news broadcasts, and personal calendar events. Messages from these services are automatically downloaded to a wearable device throughout the day, and users can browse them using speech recognition and tactile input. To provide an unobtrusive interface for nomadic users, the audio and text information is presented using a combination of ambient and auditory cues, synthetic speech, and spatialized audio. A notification model developed in Nomadic Radio dynamically selects the relevant presentation level for incoming messages based on message priority, user activity, and the level of conversation in the environment. Temporal actions of the user, such as activating or ignoring messages while listening, reinforce or decay the presentation level over time and change the underlying notification model. Scalable notification allows incoming messages to be dynamically presented as subtle ambient sounds, distinct VoiceCues, spoken summaries, or spatialized audio streams foregrounded for the listener. This thesis addresses techniques for peripheral awareness, spatial listening, and contextual notification to manage the user’s focus of attention on a wearable audio computing platform.

Thesis Supervisor: Christopher M. Schmandt
Title: Principal Research Scientist, MIT Media Laboratory
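The notification model described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation. The class name, weights, step size, bias limits, and the linear scoring rule are all hypothetical, chosen only to show how priority, user activity, and conversation level might combine into one of the four presentation levels, and how activating or ignoring messages could shift later choices. The direction in which each input moves the score is also an assumption: here, recent user activity raises the level and nearby conversation lowers it.

```python
from dataclasses import dataclass

# Presentation levels named in the abstract, ordered from least to most intrusive.
LEVELS = ["ambient sound", "VoiceCue", "spoken summary", "foreground audio stream"]


@dataclass
class NotificationModel:
    # All numeric values here are illustrative assumptions, not values from the thesis.
    priority_weight: float = 0.6
    activity_weight: float = 0.2
    conversation_weight: float = 0.2
    bias: float = 0.0        # offset adjusted by the user's actions over time
    step: float = 0.1        # how far one action moves the bias
    max_bias: float = 0.5    # keeps the learned offset bounded

    def score(self, priority: float, activity: float, conversation: float) -> float:
        """Combine the three inputs (each expected in [0, 1]) into a score in [0, 1]."""
        raw = (self.priority_weight * priority
               + self.activity_weight * activity
               + self.conversation_weight * (1.0 - conversation)
               + self.bias)
        return min(1.0, max(0.0, raw))

    def select_level(self, priority: float, activity: float, conversation: float) -> str:
        """Map the score onto one of the presentation levels."""
        s = self.score(priority, activity, conversation)
        index = min(int(s * len(LEVELS)), len(LEVELS) - 1)
        return LEVELS[index]

    def reinforce(self) -> None:
        """Called when the user activates a message: later messages are presented more prominently."""
        self.bias = min(self.max_bias, self.bias + self.step)

    def decay(self) -> None:
        """Called when the user ignores a message: later messages are presented more subtly."""
        self.bias = max(-self.max_bias, self.bias - self.step)


if __name__ == "__main__":
    model = NotificationModel()
    print(model.select_level(priority=1.0, activity=1.0, conversation=0.0))
    model.decay()  # the user ignored a message
    print(model.select_level(priority=0.5, activity=0.5, conversation=0.5))
```

Under these assumptions, a single bounded bias term stands in for the reinforcement and decay behavior: repeated ignoring pushes every later message toward subtler cues, and repeated activation pushes it toward foreground playback.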
