A sensitivity study of the clustering approach to workload modeling (extended abstract)
In a paper published in 1984 [Ferr84], the validity of applying clustering techniques to the design of an executable model for an interactive workload was discussed. The following assumptions, intended not to be necessarily realistic but to provide sufficient conditions for the applicability of clustering techniques, were made:
The system whose workload is to be modeled is an interactive system, and its performance can be accurately evaluated by solving a product-form closed queueing network model.
The behavior of each interactive user can be adequately modeled by a probabilistic graph (called a user behavior graph); in such a graph, each node represents an interactive command type, and the duration of a user's stay in the node probabilistically equals the time the user spends typing in a command of that type, waiting for the system's response, and thinking about what command should be input next.
The interactive workload to be modeled is stationary, and the workload model to be constructed is intended to reproduce its global characteristics (not those of some brief excerpt from it exhibiting peculiar dynamics), hence to be stationary as well.
It was shown in [Ferr84] that, under these assumptions, clustering command types having the same probabilistic resource demands does not affect the values of the performance indices the evaluators are usually interested in, provided the visit ratio to each node in the reduced (i.e., post-clustering) user behavior graph is equal to the sum of the visit ratios the cluster's components had in the original graph.
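The visit-ratio condition can be illustrated with a minimal Python sketch. The command types, visit ratios, and demand values below are hypothetical, as are the helper names `merge_identical` and `total_demand`. The sketch checks only the bookkeeping (the visit ratios of merged nodes add up, so the visit-ratio-weighted demand on each resource is unchanged); it is not the proof given in [Ferr84].

```python
from collections import defaultdict


def merge_identical(commands):
    """Merge command types that have exactly the same demand vector.

    `commands` is a list of (visit_ratio, demands) pairs, where `demands`
    is a tuple of per-resource service demands. The visit ratio of each
    merged node is the sum of the visit ratios of its components.
    """
    merged = defaultdict(float)
    for visit_ratio, demands in commands:
        merged[tuple(demands)] += visit_ratio
    return [(visit_ratio, demands) for demands, visit_ratio in merged.items()]


def total_demand(commands):
    """Visit-ratio-weighted demand placed on each resource."""
    resources = len(commands[0][1])
    return [sum(v * d[k] for v, d in commands) for k in range(resources)]


# Hypothetical user behavior graph: (visit ratio, (CPU seconds, disk seconds)).
original = [
    (0.5, (0.10, 0.20)),
    (0.3, (0.10, 0.20)),  # same demands as the first command type
    (0.2, (0.40, 0.05)),
]

reduced = merge_identical(original)
print(len(original), "command types reduced to", len(reduced))
print(total_demand(original), total_demand(reduced))
```

Under these assumptions the two demand vectors printed at the end agree up to floating-point rounding, which is the sense in which the reduction leaves the load on each resource untouched.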
Since the reduction we have just described is equivalent to replacing each cluster with one or more representatives of its components, and since this is also the goal of applying clustering techniques to the construction of executable workload models substantially more compact than the original workload to be modeled, this result shows that such techniques are valid (i.e., produce accurate models) when the assumptions and the conditions mentioned above are satisfied.
One condition which in practice is never satisfied, however, is that the clustered commands are characterized by exactly the same resource demands. In fact, clustering algorithms are non-trivial just because they have to recognize “nearness” among commands with different characteristics, and group those and only those commands whose resource demands are sufficiently similar (where the notion of similarity is to be defined by introducing that of distance between two commands). Thus, the question of the sensitivity of a workload model's accuracy to the inevitable dispersion of the characteristics of a cluster's components immediately arises. We know that, if an adequate product-form model of an interactive system can be built, if the users' behaviors can be accurately modeled by probabilistic graphs, and if the workload and the model of it to be constructed are stationary, then a workload model in which all commands with identical characteristics are grouped together and modeled by a single representative is an accurate model of the given workload (i.e., the model produces the same values of the performance indices of interest as the modeled workload when it is processed by a given system). This is true, of course, provided the visit ratios of the workload model's components equal the sums of those of the corresponding workload components. If we now apply a clustering algorithm to the given workload, thereby obtaining clusters of similar, but not identical, commands, and we build a workload model by assembling cluster representatives (usually one per cluster, for instance with demands corresponding to those of the cluster's center of mass), by how much will the values of the performance indices produced by the workload model running on the given system differ from those produced by the workload to be modeled?
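To make the notions of cluster representative and dispersion concrete, the following Python sketch computes the center of mass of a cluster and one possible dispersion measure. The unweighted mean, the Euclidean distance, the root-mean-square spread, and all demand values are assumptions chosen for illustration; the text above does not fix the distance or the weighting.

```python
import math


def center_of_mass(points):
    """Unweighted mean of the demand vectors in a cluster (an assumption;
    a visit-ratio-weighted mean would be an equally plausible choice)."""
    count = len(points)
    dims = len(points[0])
    return tuple(sum(p[k] for p in points) / count for k in range(dims))


def dispersion(points):
    """Root-mean-square Euclidean distance of the points from their
    center of mass; zero when all demand vectors are identical."""
    center = center_of_mass(points)
    squared = sum(math.dist(p, center) ** 2 for p in points)
    return math.sqrt(squared / len(points))


# Hypothetical cluster of three similar command types:
# each tuple is (CPU seconds, disk seconds).
cluster = [(0.10, 0.20), (0.12, 0.18), (0.14, 0.22)]

representative = center_of_mass(cluster)
spread = dispersion(cluster)
print("representative demands:", representative)
print("dispersion:", spread)
```

A cluster of identical commands has zero dispersion and falls under the result of [Ferr84]; the question raised here concerns clusters for which this quantity is positive.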
As with several other problems, this could be attacked by a mathematical approach or by an experimental one. While a successful mathematical analysis of the sensitivity of the major indices to the dispersion in the resource demands of the commands being clustered together would provide more general results, it would also be likely to require the introduction of simplifying assumptions (for example, having to do with the distributions of the resource demands in a cluster around its center of mass) whose validity would be neither self-evident nor easy to verify experimentally.
On the other hand, an experimental approach achieves results which, strictly speaking, are only applicable to the cases considered in the experiments. Extrapolations to other systems, other workloads, and other environments usually require faith, along with experience, common sense, and familiarity with real systems and workloads. This inherent lack of generality is somewhat counterbalanced, however, by the higher degree of realism that is achievable with an experimental investigation. In particular, when the properties of workloads are to play a crucial role in a study (there are very few studies indeed in which this is not the case!), using a mathematical approach is bound to raise questions about such properties that are either very difficult or impossible to answer. Primarily for this reason, and knowing very well the limitations in the applicability of the results we would obtain, we decided to adopt an experimental approach.
Since the question we were confronted with had never been answered before (nor, to our knowledge, had it been asked), we felt that our choice was justified by the exploratory nature of the study. If the resulting sensitivity were to turn out to be high, we could conclude that not even under the above assumptions can clustering techniques be trusted to provide reasonable accuracy in all cases, and hence that they should either not be used or be used with caution in those cases (if they exist) in which their accuracy might be acceptable. If, on the other hand, the sensitivity were low, then we could say that, in at least one practical case, clustering techniques would have been shown to work adequately (of course, under all the other assumptions listed above).
The rationale of this investigation might be questioned by asking why it would not be more convenient to test the validity of clustering techniques directly, that is, by comparing the performance indices produced by a real workload to those produced by an executable model (artificial workload) built according to a clustering technique. Our answer is that, in this study as well as in [Ferr84], we are more interested in understanding the limitations and the implications of clustering and other workload model design methods than in evaluating the accuracy of clustering in a particular case. In other words, we are less interested in finding out whether the errors due to clustering are of the order of 10% or of 80% than in understanding why they are only 10% or as large as 80%, respectively. Thus, we need to decompose the total error into the contributions made to it by the various discrepancies that any real situation exhibits with respect to the ideal one. This paper describes a study primarily performed to assess the magnitude of one such contribution, that of the dispersion of the resource demands of clustered commands.
An experimental approach, in the case being considered here, requires first of all that a workload for the experiment be selected. Then, that workload is to be measured, in order to obtain the values of the parameters defined by the desired characterization.
Next, an executable workload model is to be built by applying a clustering technique to the real workload selected. Then, the workload and its model are to be run on the same system, so that the model's accuracy can be evaluated by comparing the performance indices produced by them. Since our study seeks to isolate the sensitivity of that accuracy to the differences in demands among the commands that have been grouped into the same cluster, these differences must be made the only source of inaccuracies in the performance produced by the model. To isolate this contribution to the error, all other sources of error should be eliminated. Finally, the experiment is to be carried out, and its results interpreted.
The results show that, on the whole, the clustering method for workload model design is reasonably accurate in the context of the case examined in our study. The sensitivities we found were reasonably low. Thus, we can state that, in at least one practical case and under the assumptions discussed in this paper, clustering techniques for executable workload model design have been shown to work well.
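The comparison of performance indices described above can be sketched in Python under strong simplifying assumptions. Here the "system" is a single-class product-form closed network solved by exact mean value analysis (MVA) rather than a real machine, each workload is collapsed into its mean demand per interaction, and the cluster representative is an unweighted center of mass. The function names, demand values, think time, and number of users are all hypothetical; this is not the experimental setup of the study, and the error it computes is not a result of the paper.

```python
def mva(demands, think_time, users):
    """Exact single-class MVA for a closed network of queueing stations.

    Returns (throughput, response_time) with `users` interactive users.
    """
    queue = [0.0] * len(demands)
    throughput, response = 0.0, 0.0
    for n in range(1, users + 1):
        # Residence time at each station seen by an arriving customer.
        residence = [d * (1.0 + q) for d, q in zip(demands, queue)]
        response = sum(residence)
        throughput = n / (think_time + response)
        queue = [throughput * r for r in residence]
    return throughput, response


def mean_demands(commands):
    """Mean demand per interaction on each resource, weighting each
    command type by its (normalized) visit ratio."""
    total = sum(v for v, _ in commands)
    dims = len(commands[0][1])
    return [sum(v * d[k] for v, d in commands) / total for k in range(dims)]


def relative_error(model_value, reference_value):
    """Relative deviation of the model's index from the reference."""
    return abs(model_value - reference_value) / reference_value


# Hypothetical workload: (visit ratio, (CPU seconds, disk seconds)).
workload = [(0.6, (0.10, 0.20)), (0.2, (0.14, 0.24)), (0.2, (0.40, 0.05))]
# Hypothetical model: the first two command types are clustered and
# replaced by their unweighted center of mass, with summed visit ratio.
model = [(0.8, (0.12, 0.22)), (0.2, (0.40, 0.05))]

_, r_reference = mva(mean_demands(workload), think_time=5.0, users=20)
_, r_model = mva(mean_demands(model), think_time=5.0, users=20)
print("relative error in response time:", relative_error(r_model, r_reference))
```

In this simplified setting the error arises only because the clustered commands differ in both demands and visit ratios; if their demands were identical, the two mean demand vectors, and hence the computed indices, would coincide.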