Media synchronization in distributed multimedia file systems

One of the unique features that distinguishes digital multimedia from traditional computer data is the presence of multiple media streams (such as audio and video), whose display must proceed in a mutually synchronized manner. The design of techniques for synchronizing multimedia data at the time of storage on and retrieval from network file servers, which are essential in designing multimedia-on-demand servers over metropolitan area networks, is the subject matter of this paper.

A multimedia object being stored on a network file server may be composed of media components generated by different sources on the network. Furthermore, the media components may be generated at different times, but their display may be required to be simultaneous (e.g., audio dubbing in movies). Hence, it is convenient to place all the media units constituting a multimedia object on a relative time scale, with the media units at the beginning of the object placed at zero on the scale. The position of a media unit on the relative time scale defines its relative time stamp (RTS). Each media unit (such as a video frame or an audio sample) should be associated with an RTS. During retrieval, all media units with the same RTS must be displayed simultaneously.

Assignment of RTSs to media units is straightforward in the absence of both network jitter and rate mismatches between the I/O devices used to generate media units. However, in future integrated networks, I/O devices that lack the sophistication to run elaborate time rate synchronization protocols may be connected directly (as opposed to via a host computer) to the network, e.g., Etherphone, ISDN Telephone, etc. Furthermore, even in environments in which clocks are synchronized, compression may yield variations in the sizes of media units, as a result of which the actual period of each media unit may vary between $p(1 - c)$ and $p(1 + c)$,
where $p$ is the nominal period and $c$ the maximum fractional variation.

The relative time scale is based on a master media source (all other sources are slave sources), whose choice is application dependent. The file server, when it receives media units from the master and slave devices, determines sets of media units that are generated within a tolerable window of asynchrony, $\mathcal{A}$, and assigns the same RTS to media units that are in the same set. Such sets of media units are called synchronization sets: media units $n_m$ and $n_s$ from the master and slave devices, respectively, can form a synchronization set iff their generation times $g(n_m)$ and $g(n_s)$ are such that $|g(n_m) - g(n_s)| \le \mathcal{A}$.

Owing to non-deterministic variations in the transmission delays of media units (which are assumed to be bounded between $\Delta_{\min}$ and $\Delta_{\max}$), the exact generation times of media units are not known to the file server. Determination of synchronization sets is therefore based on the file server's estimates of the earliest and latest possible generation times of media units from the master and slave sources. When media unit $n$ is received by the file server at time $r$, the file server can determine the earliest and latest possible generation times of media unit $n$, denoted by $g^e(n)$ and $g^l(n)$, to be $g^e(n) = r - \Delta_{\max}$ and $g^l(n) = r - \Delta_{\min}$. Given the generation intervals $[g^e(n_m), g^l(n_m)]$ and $[g^e(n_s), g^l(n_s)]$ of the latest media units $n_m$ and $n_s$ received from the master and slave sources, respectively, the file server can determine that $n_m$ and $n_s$ belong to a synchronization set if

$$\delta g_{\max}(n_m, n_s) \le \mathcal{A} \qquad (1)$$
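
The interval estimate and the membership test in condition (1) can be illustrated with a minimal Python sketch. It rests on one assumption that the text above does not state: the quantity on the left-hand side of (1) is taken to be the largest difference in generation times that is consistent with the two generation intervals. The names `GenerationInterval`, `generation_interval`, `max_generation_gap`, and `in_same_sync_set` are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenerationInterval:
    """Earliest and latest possible generation times of one media unit."""
    earliest: float  # g^e(n) = r - delta_max
    latest: float    # g^l(n) = r - delta_min


def generation_interval(r: float, delta_min: float, delta_max: float) -> GenerationInterval:
    """Estimate when a unit received at time r could have been generated,
    given that its transmission delay lies in [delta_min, delta_max]."""
    if not 0 <= delta_min <= delta_max:
        raise ValueError("expected 0 <= delta_min <= delta_max")
    return GenerationInterval(earliest=r - delta_max, latest=r - delta_min)


def max_generation_gap(master: GenerationInterval, slave: GenerationInterval) -> float:
    """Assumed meaning of the left-hand side of (1): the worst-case
    |g(n_m) - g(n_s)| over all generation times allowed by both intervals."""
    return max(master.latest - slave.earliest, slave.latest - master.earliest)


def in_same_sync_set(master: GenerationInterval, slave: GenerationInterval,
                     asynchrony: float) -> bool:
    """True if even the worst-case gap stays within the tolerable asynchrony."""
    return max_generation_gap(master, slave) <= asynchrony


# Example with made-up numbers: delays bounded between 2 and 5 time units.
master = generation_interval(r=100.0, delta_min=2.0, delta_max=5.0)  # [95, 98]
slave = generation_interval(r=104.0, delta_min=2.0, delta_max=5.0)   # [99, 102]
print(max_generation_gap(master, slave))       # 7.0
print(in_same_sync_set(master, slave, 10.0))   # True
print(in_same_sync_set(master, slave, 5.0))    # False
```

Under this reading the test is conservative: two media units are placed in the same synchronization set only when every pair of generation times permitted by the delay bounds satisfies the asynchrony limit.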