The increasing demand for content-based manipulation of ever-growing stores of audio data, together with the emergence of MPEG-7, has created a need for structured audio representations. However, while the necessity of such a representation has been recognised and, to some extent, its essential features have been identified, its actual development and implementation have generally been deferred as problems for another time or person to solve. This paper attempts to address this shortfall by defining an audio structure that allows content-based manipulation of audio at the level of audio objects. The paper then summarises the processes required to generate such a structure. Further, details are provided on how the second level of this structure can be derived from a low-level, perceptually based audio representation previously developed by the authors to satisfy the requirements of the lowest level of the audio structure. Finally, initial experimental results are presented.