Effective browsing of long audio recordings

Timeliner is a browser for long audio recordings and for features derived from them. Features can be signal-based, like spectrograms, or model-based, like the outputs of categorical classifiers. Unlike conventional audio editors, Timeliner pans and zooms smoothly across many orders of magnitude, from days-long overviews to millisecond-scale details, with zero latency, zero flicker, and low CPU load. To suggest which details are worth zooming in on, Timeliner's agglomerative hierarchical caches propagate feature-specific details up to wider zoom levels. Because these details are not averaged away, "big data" can be browsed rapidly and effectively. Several studies demonstrate this.
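
The idea of a hierarchical cache that propagates details upward rather than averaging them away can be illustrated with a minimal sketch. This is not Timeliner's implementation: the function names `build_pyramid` and `mean` are hypothetical, the pairwise (factor-of-two) agglomeration is an assumption, and taking the maximum is only one plausible feature-specific reduction. The sketch builds one cache per zoom level and contrasts a max reduction with a mean reduction on a signal containing a single brief event.

```python
from typing import Callable, List, Sequence


def mean(values: Sequence[float]) -> float:
    """Plain average, used here only as the contrast case."""
    return sum(values) / len(values)


def build_pyramid(
    samples: Sequence[float],
    reduce: Callable[[Sequence[float]], float],
) -> List[List[float]]:
    """Build a list of zoom levels from a 1-D feature track.

    Level 0 is the raw track. Each higher level merges adjacent pairs
    of the level below with `reduce`, so it is about half as long.
    (Hypothetical helper; the factor of two is an assumption.)
    """
    levels = [list(samples)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        # A trailing unpaired element is reduced on its own.
        levels.append([reduce(prev[i:i + 2]) for i in range(0, len(prev), 2)])
    return levels


# A silent track of 1024 frames with one brief event at frame 700.
track = [0.0] * 1024
track[700] = 1.0

max_levels = build_pyramid(track, max)    # detail-preserving reduction
mean_levels = build_pyramid(track, mean)  # conventional averaging

# At the widest zoom level the event is still at full strength under max,
# but diluted by a factor of 1024 under mean.
print(len(max_levels), max_levels[-1][0], mean_levels[-1][0])
```

With a max reduction, the event remains visible at every level, so an overview can still signal where zooming in is worthwhile. With a mean reduction, the same event fades by half at each level. Because each level is precomputed, rendering a view only needs to read the level whose resolution matches the current zoom.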
