Using empirical distributions to characterize Web client traffic and to generate synthetic traffic

We model a Web client using empirical probability distributions for user clicks and transferred data sizes. A heuristic threshold value distinguishes user clicks in a packet trace, which gives a simple method for analyzing large packet traces to obtain user off times and the amount of data transferred as a result of each user click. We derive the empirical probability distributions from this analysis of the packet trace. The heuristic is not perfect, but we believe it is good enough to produce a useful Web client model. We use the empirical model to implement a Web client traffic generator. The characteristics of the generated traffic are very close to those of the original packet trace, including its self-similar properties.
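
The approach summarized above can be illustrated with a minimal sketch. It is not the implementation described in this work: it assumes the threshold is applied to the idle gap between consecutive packets of one client, that an off time is the idle gap preceding a click, and that the generator draws off times and sizes independently. The names `split_clicks`, `sample_empirical`, and `generate_clicks`, and the threshold value used in the example, are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def split_clicks(packets: Sequence[Tuple[float, int]],
                 threshold: float) -> Tuple[List[float], List[int]]:
    """Split one client's packets into click bursts.

    packets: (timestamp_seconds, size_bytes) pairs sorted by timestamp.
    Assumption: an idle gap longer than `threshold` marks a new user click.
    Returns (off_times, bytes_per_click).
    """
    off_times: List[float] = []
    click_sizes: List[int] = []
    if not packets:
        return off_times, click_sizes

    last_time, current_bytes = packets[0]
    for timestamp, size in packets[1:]:
        gap = timestamp - last_time
        if gap > threshold:
            # Close the current click and record the idle period before the next.
            click_sizes.append(current_bytes)
            off_times.append(gap)
            current_bytes = size
        else:
            current_bytes += size
        last_time = timestamp
    click_sizes.append(current_bytes)
    return off_times, click_sizes


def sample_empirical(values: Sequence[float], rng: random.Random) -> float:
    """Inverse-transform sample from the empirical distribution of `values`."""
    ordered = sorted(values)
    index = min(int(rng.random() * len(ordered)), len(ordered) - 1)
    return ordered[index]


def generate_clicks(off_times: Sequence[float], click_sizes: Sequence[int],
                    count: int, seed: int = 0) -> List[Tuple[float, float]]:
    """Generate `count` synthetic (off_time, bytes) pairs.

    Assumption: off times and sizes are drawn independently.
    """
    rng = random.Random(seed)
    return [(sample_empirical(off_times, rng), sample_empirical(click_sizes, rng))
            for _ in range(count)]


if __name__ == "__main__":
    # Tiny hand-made trace; the 1.0 s threshold is purely illustrative.
    trace = [(0.0, 100), (0.2, 200), (5.0, 50), (5.1, 50), (12.0, 400)]
    offs, sizes = split_clicks(trace, threshold=1.0)
    print(offs, sizes)
    print(generate_clicks(offs, sizes, count=3))
```

Sampling directly from the sorted observations keeps the generator free of any parametric assumption about the distributions, at the cost of never producing a value that was not seen in the trace.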