ENHANCING SONIC BROWSING USING AUDIO INFORMATION RETRIEVAL

Sound and music collections of increasing size and diversity are used by both typical computer users and multimedia designers. Browsing such collections poses several challenges for the design of effective user interfaces. Recent techniques in audio information retrieval allow audio content information to be extracted automatically, and this information can be used to inform and enhance audio browsing tools. In this paper we describe how audio information retrieval can be used to create novel user interfaces for browsing audio collections. More specifically, we report on recent work on two system prototypes, the Sonic Browser and Marsyas, and on our current work on merging the two into a common, flexible system.
