4.2 Performance Improvement Strategies


As discussed in section 3.2, Cloud providers recommend that the performance of instances should be measured before workloads are deployed, and indeed, Mouline (2009) suggests that the measurement of Cloud performance should be conducted ‘…before deployment and continually in production…’. Pro-actively measuring instance performance is the approach followed by Netflix, who measure and monitor performance before and during deployment, looking for variance and outliers. Implicitly, the purpose of this monitoring is to identify ‘poor’ performing instances, terminate them and acquire suitable replacements. It is of course the elasticity of Cloud services that makes such an approach possible, but it brings additional costs, as instances are potentially benchmarked and terminated without having done any useful work. Whilst, as far as we are aware, Netflix does not publicly quantify the effectiveness of this strategy, such as the overhead incurred for the performance gained, arguably they would not engage in such a strategy if it were not effective for their needs.


The Netflix strategy is variously called ‘Ditch and Deploy’, ‘Instance Seeking’ or ‘Placement Gaming’. We use the term performance improvement strategy. Dynamically acquiring and releasing resources with the objective of improving performance comes at a cost, and so these strategies are a trade-off between potential performance gains and costs incurred. The purpose of this section is to describe extant strategies, as considered by Ou et al. (2013) and Farley et al. (2012), but also to examine their modelling assumptions, in turn re-evaluating their risk, by which we mean the standard deviation around the expected cost of obtaining a particular performance level.
A ‘Ditch and Deploy’ strategy proceeds as follows: (1) acquire an instance and incur a cost equal to the provider’s unit of billable time; (2) check performance: if performance is sufficient, deploy the instance into production, else terminate it. This process repeats until the required number of instances with sufficient performance has been acquired. By performance_overhead we mean the cost incurred until an instance with the required performance level has been obtained. For example, if a user has acquired and terminated 3 instances before a suitable instance is found, then performance_overhead is, at a minimum, the cost incurred by those 3 instances. However, should we include the cost of measuring instance 4? Arguably this will depend upon the provider’s minimum billing period. For example, EC2 previously billed in per-hour multiples, and so time spent measuring an instance rather than doing useful work has typically been ignored. However, with a move to per-minute billing, arguably such measurement costs should be included.
Algorithm 1 shows how performance_overhead is calculated, assuming per-minute billing and including the cost of measuring the instance that is eventually deployed.
Algorithm 1: Ditch and Deploy

obtained ← 0
performance_overhead ← 0
while obtained < 1:
    instance ← cloud.request_instance()
    if instance.performance_is_sufficient():
        obtained ← obtained + 1
        performance_overhead ← performance_overhead + instance.cost()
    else:
        cloud.terminate_instance(instance)
        performance_overhead ← performance_overhead + instance.cost()


When seeking n instances, the user begins by requesting n, and if k have sufficient performance then these are deployed with the remaining n – k terminated or ‘ditched’. In the next iteration n – k are requested.
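A minimal sketch of this procedure in Python is given below, assuming per-minute billing; the provider interface, the price, the measurement time and the probability that an instance is ‘good’ are all hypothetical stand-ins for whatever API and benchmark a user actually has available.

import random

PRICE_PER_MINUTE = 1.0     # assumed price per unit of billable time
MEASUREMENT_MINUTES = 1    # assumed time spent benchmarking each instance
P_SUFFICIENT = 0.5         # assumed probability an instance performs sufficiently

class Instance:
    def __init__(self):
        # Performance is unknown until the instance is measured.
        self.sufficient = random.random() < P_SUFFICIENT
    def cost(self, minutes=MEASUREMENT_MINUTES):
        return PRICE_PER_MINUTE * minutes

def ditch_and_deploy(n):
    """Seek n instances with sufficient performance (Algorithm 1, generalised to n)."""
    deployed = []
    performance_overhead = 0.0
    while len(deployed) < n:
        # Request all still-needed instances in one batch, as described above.
        batch = [Instance() for _ in range(n - len(deployed))]
        for instance in batch:
            performance_overhead += instance.cost()  # measurement time is billed
            if instance.sufficient:
                deployed.append(instance)            # deploy into production
            # else: the instance is terminated ('ditched') and contributes only cost
    return deployed, performance_overhead

if __name__ == "__main__":
    instances, overhead = ditch_and_deploy(20)
    print(f"obtained {len(instances)} instances, performance_overhead = ${overhead:.2f}")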


The performance_overhead cost is incurred up front, but is amortized over the instance’s lifetime. The effective cost per unit of billable time for an instance obtained through instance seeking, with a duration of T minutes and a per-minute price denoted by price, is defined as:
effective_cost := price + performance_overhead / T        (4.1)
Clearly, the longer the instance is run for, the lower the effective per minute costs.
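For illustration, with hypothetical figures: if price is $0.01 per minute, performance_overhead is $3 and the instance runs for T = 1,000 minutes, then effective_cost = 0.01 + 3/1,000 = $0.013 per minute, falling to $0.0103 per minute at T = 10,000 minutes.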
Ou et al. (2013) make use of Ditch and Deploy to reduce the execution time of a fixed workload, and hence reduce workload execution costs. They seek on CPU model rather than on a measured performance level. However, their analysis makes a number of assumptions which deserve further consideration. They assume that performance variation between instances with the same CPU model is negligible, so that all instances with the same CPU have the same constant performance with no variation over time. However, in sections 5.2, 5.6 and 5.9 we demonstrate that large performance variation can exist between instances of the same type on the same CPU model. As a result of their assumption there is no cost variation for executing the same workload across instances with the same CPU model, which in practice would not be the case.
There is a further implicit assumption regarding how instances are allocated, or scheduled, onto hosts. In particular, Ou et al. (2013) assume that instances are scheduled independently, so that the CPU model an instance obtains is independent of those already obtained. There is no reason to assume that all providers schedule in this manner. Indeed, Google (Verma et al., 2015) refer to scheduling workloads independently across machines as ‘worst fit’, whilst ‘best fit’ packs workloads as tightly as possible. With a packing policy, instances requested together are likely to be placed together and so have the same CPU model.
If we assume both independence and a fixed probability p > 0 of obtaining an instance on a specific CPU model, then instance seeking becomes a sequence of negative binomial trials. Curiously, when referring to the cost of obtaining ‘best’ instances, Ou et al. (2013) do not make clear that they are reporting an expected cost, and no variance is reported. Standard deviation (the square root of variance) is a commonly used measure of risk in finance, and by not reporting it an impression is given that instance seeking is risk free.
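To make the unreported spread explicit (this is our own illustration, using standard properties of the negative binomial distribution, not a result given by Ou et al.): if n instances on the desired CPU model are required and each acquisition independently yields that model with probability p, then the total number of instances acquired, N, has expectation E[N] = n/p and variance Var(N) = n(1 – p)/p^2. With a price per unit of billable time denoted price, the expected seeking cost is therefore price × n/p, with a standard deviation of price × sqrt(n(1 – p))/p, which grows rapidly as p falls.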
Farley et al. (2012) consider an approach more suitable for service-oriented workloads: if n instances are required for T hours (with an hour assumed to be the unit of billable time), then in the first hour start n + k instances, where k > 0. At the end of the first hour, ditch the worst k instances and keep the remaining n in service for the next (T – 1) hours. The performance_overhead in this case is fixed and is amortized over the T hours, with the objective that price/performance improves upon the mean price/performance. Unlike Ou et al. (2013), instances are not ditched immediately, but rather at the end of their first hour, and so they make a contribution to the work done. Note that both Farley et al. (2012) and Ou et al. (2013) assume per-hour billing in their models and so ignore any time lost to measurement or CPU identification within the first billed hour.
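A minimal Python sketch of this ‘n + k’ strategy is given below; the per-hour price and the normal distribution parameters used for the simulated first-hour benchmark are illustrative assumptions, not figures from Farley et al. (2012).

import random

PRICE_PER_HOUR = 1.0  # assumed price per instance-hour

def measure_performance():
    # Stand-in for a first-hour benchmark; Farley et al. (2012) model per-CPU
    # performance as normally distributed, and the parameters here are made up.
    return random.gauss(100.0, 15.0)

def n_plus_k(n, k, T):
    """Start n + k instances, ditch the worst k after hour one, run the rest for T hours."""
    # First hour: all n + k instances are billed and benchmarked.
    scores = [measure_performance() for _ in range(n + k)]
    total_cost = (n + k) * PRICE_PER_HOUR
    # Keep the best n for the remaining T - 1 hours.
    kept = sorted(scores, reverse=True)[:n]
    total_cost += n * (T - 1) * PRICE_PER_HOUR
    performance_overhead = k * PRICE_PER_HOUR  # the fixed cost of the k ditched instances
    return kept, total_cost, performance_overhead

if __name__ == "__main__":
    kept, cost, overhead = n_plus_k(n=10, k=5, T=24)
    print(f"mean kept performance = {sum(kept) / len(kept):.1f}, "
          f"total cost = ${cost:.2f}, performance_overhead = ${overhead:.2f}")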

Farley et al. (2012) model per-CPU performance as a normal distribution, with performance in any given hour assumed to be independent of previous hours. Instances with the same CPU have the same mean performance and also the same variance, which seems somewhat implausible. Suppose we have instance A and instance B on different hosts, but with the respective hosts having the same CPU model. Suppose instance A has no neighbours and so experiences no resource contention. We would expect instance A to obtain the best performance the host is capable of, and configured, to deliver to it. Further, suppose that instance B is on a host which has the maximum configured number of co-locating instances, and that each of these instances is running resource-intensive workloads. Given reported performance degradation due to resource contention, we would expect the average performance of instance B to be lower than that of A, and more variable.


As both Ou et al. (2013) and Farley et al. (2012) assume independence of CPU model when requesting instances, it is interesting to consider the effect on performance_overhead when we do not make this assumption. Consider a provider offering instances on CPU1 and CPU2 at a price of $1 per unit of billable time, with a scheduling policy such that whenever n > 0 instances are requested they are all scheduled onto either CPU1 or CPU2. Further, assume there is a probability p > 0 of obtaining CPU1 and q = 1 – p of obtaining CPU2. Taking p = 0.5, we can estimate the cost of seeking 20 instances on CPU1 using a Monte Carlo (MC) simulation with 1,000,000 runs of Algorithm 1. Doing so, we find a mean of $40.0, a standard deviation of $28.2 and an estimated MC error (the sampling error) of 0.0282. We therefore find the same expected cost as in the case of independence, but a significant increase in variation, i.e. risk.
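A minimal Monte Carlo sketch of this scenario, using the same p = 0.5 and $1 price per unit of billable time, and with the batch-at-a-time scheduling policy described above, might look as follows:

import random
import statistics

P_CPU1 = 0.5      # probability a whole batch is scheduled onto CPU1
PRICE = 1.0       # $ per instance per unit of billable time
NEEDED = 20       # instances required on CPU1
RUNS = 1_000_000  # Monte Carlo runs

def seek_cost():
    """Cost of obtaining NEEDED CPU1 instances when each batch lands on a single CPU."""
    obtained, cost = 0, 0.0
    while obtained < NEEDED:
        batch = NEEDED - obtained
        cost += batch * PRICE             # every requested instance is billed
        if random.random() < P_CPU1:      # the whole batch landed on CPU1
            obtained += batch
        # else: the whole batch is on CPU2 and is ditched
    return cost

costs = [seek_cost() for _ in range(RUNS)]
mean, sd = statistics.fmean(costs), statistics.stdev(costs)
print(f"mean = ${mean:.1f}, sd = ${sd:.1f}, MC error = {sd / RUNS ** 0.5:.4f}")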
Whilst risk provides opportunities, it is essential that it is fully characterized before strategies designed to exploit it are employed. Arguably, however, it may be preferable for users to make use of performance-based services such as could be offered by brokers, and to let the brokers assume the risk. In the next section we review work on brokers - both those that propose performance-based services, and more general ones.
