A Performance Brokerage for Heterogeneous Clouds




5.8 Performance Correlations

In section 5.7 we noted a number of instances that appeared to ‘move’ in response to a common stimulus. This raises the question of whether we can find correlations between instances running the same set of benchmarks. A priori, our expectation is for instances to be ‘good’, ‘bad’ or ‘indifferent’ for all workloads equally. In addition to the 200 C4 instances benchmarked in section 5.6, we also benchmark 200 M4 instances.


The Pearson correlation coefficient is commonly used to determine the degree of linear association between two data sets; however, it assumes that both sets of data are normally distributed. As we saw in sections 5.2 – 5.4, performance is not normally distributed, and so we consider two commonly used alternative measures of correlation: Spearman’s Rho and Kendall’s Tau, neither of which makes this assumption. Both measure ordinal, or rank, correlation: if we rank instances by their performance on one workload, and then repeat the exercise for another workload, both coefficients measure the degree of association between the respective orderings. Both coefficients lie in the range [-1, 1], with -1 indicating perfect negative correlation, 0 no correlation, and 1 perfect positive correlation. Croux and Dehon (2010) show that Kendall’s Tau is more robust (less sensitive to outliers), and so we use it to measure correlation.
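As an illustration, the rank-based statistic can be sketched in a few lines of pure Python. This is a minimal version of the Tau-a form (concordant minus discordant pairs over all pairs), ignoring the tie corrections that library implementations such as SciPy’s `kendalltau` handle; it is an illustrative sketch, not the tooling used in this study.

```python
from itertools import combinations

def kendall_tau(x, y):
    """Kendall's Tau-a: (concordant - discordant) / total pairs.

    A pair (i, j) is concordant when the two data sets agree on its
    ordering, discordant when they disagree; ties count as neither.
    """
    assert len(x) == len(y)
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs

# Identically ordered data gives +1; fully reversed ordering gives -1.
print(kendall_tau([1, 2, 3, 4], [10, 20, 30, 40]))  # 1.0
print(kendall_tau([1, 2, 3, 4], [40, 30, 20, 10]))  # -1.0
```

Because only the relative ordering of values enters the computation, the statistic is unaffected by outliers in magnitude, which is the robustness property noted above.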
In Table 22 below we present the results for each pair of compute-bound workloads for both the C4 and M4 instances. Note that values are presented in the order (C4, M4).
Table 22: Performance correlations for GNUGO, Hmmer, NAMD, pbzip2, POV-Ray and sa-learn on C4 and M4 instances respectively. Different workload pairs have different degrees of correlation. We highlight examples of strong correlation and note that this may be useful to the broker, as it may allow the number of benchmarks used to be reduced: the performance of one workload can be used to predict the performance of another.

(C4, M4)    GNUGO        Hmmer        NAMD         pbzip2       POV-Ray      sa-learn
GNUGO       -            0.67, 0.32   0.59, 0.37   0.62, 0.66   0.47, 0.45   0.69, 0.40
Hmmer       0.67, 0.32   -            0.45, 0.38   0.40, 0.38   0.35, 0.18   0.75, 0.68
NAMD        0.59, 0.37   0.45, 0.38   -            0.32, 0.53   0.85, 0.41   0.50, 0.33
pbzip2      0.62, 0.66   0.40, 0.38   0.32, 0.53   -            0.17, 0.32   0.42, 0.39
POV-Ray     0.47, 0.45   0.35, 0.18   0.85, 0.41   0.17, 0.32   -            0.41, 0.19
sa-learn    0.69, 0.40   0.75, 0.68   0.50, 0.33   0.42, 0.39   0.41, 0.19   -
We note that there are no negative correlations, which would indicate a ‘large’ number of instances that were better for one benchmark but worse for another; this is perhaps to be expected. Correlation is typically stronger on C4 instances, where we find, for example, a correlation of 0.85 between POV-Ray and NAMD. In general, correlation is moderate to strong on C4 whilst weak to moderate on M4. Interestingly, and perhaps somewhat surprisingly, we find no evidence of stronger correlations amongst benchmarks of the same type, i.e. floating-point or integer. Indeed, Hmmer, a floating-point benchmark, has strong correlation on C4 instances with the integer benchmarks GNUGO and sa-learn, and yet only moderate correlation with the floating-point benchmarks NAMD and POV-Ray.


Kendall’s Tau (and similar coefficients) cannot reveal all characteristics of an association, and one of the most useful complementary tools is the scatter-plot. We present a selection below for the C4 instances.

Figure : Example scatter-plots demonstrating asymmetric performance correlation amongst C4 instances. That is, correlation is typically stronger for better performance.
The scatter-plots show that correlation is not necessarily symmetric: for pbzip2 and GNUGO, correlation is stronger amongst the better-performing instances. This means that an instance performing well on one of these workloads is likely to perform well on the other; however, because correlation is weaker in the other tail, we cannot confidently infer performance on one workload from poor performance on the other. It is natural to ask why we have not observed the strong symmetric correlation for all benchmarks that we would a priori expect. We can interpret the results in (at least) two ways: (1) instances experience large variation in noise while being benchmarked, resulting in differences in workload rankings; or (2) instances experience minimal variation in noise while being benchmarked, but different ‘types’ of noise affect different workloads differently.
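One way such tail asymmetry could be quantified, rather than merely observed visually, is to compute Kendall’s Tau separately over the better- and worse-performing halves of the instances, as ranked by one of the two workloads. The helper below is a hypothetical sketch (not part of the study’s tooling), and it assumes higher scores mean better performance:

```python
from itertools import combinations

def kendall_tau(x, y):
    # Tau-a: (concordant - discordant) / total pairs; ties count as neither.
    pairs = list(combinations(range(len(x)), 2))
    c = sum(1 for i, j in pairs if (x[i] - x[j]) * (y[i] - y[j]) > 0)
    d = sum(1 for i, j in pairs if (x[i] - x[j]) * (y[i] - y[j]) < 0)
    return (c - d) / len(pairs)

def tail_taus(x, y):
    """Tau computed separately over the worse- and better-performing
    halves of the instances, as ranked by the first workload.

    Returns (tau_lower_half, tau_upper_half); a large gap between the
    two suggests asymmetric correlation of the kind seen in the
    pbzip2/GNUGO scatter-plots.
    """
    order = sorted(range(len(x)), key=lambda i: x[i])
    half = len(x) // 2
    lo, hi = order[:half], order[half:]
    return (kendall_tau([x[i] for i in lo], [y[i] for i in lo]),
            kendall_tau([x[i] for i in hi], [y[i] for i in hi]))
```

On data where the top half of instances rank identically on both workloads but the bottom half rank in reverse, `tail_taus` returns a low value for the lower tail and a high value for the upper tail, matching the asymmetry described above.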
In the next section we investigate the effect of noise from co-locating instances.
