4.7.1 Performance
In section 4.1 we reviewed performance-related work, and we find that performance variation across supposedly identical instances is widely reported. Notably, the reported variation is not limited to specific instance types, providers or indeed types of workloads, and arguably constitutes a prima facie opportunity for a performance broker of the type discussed in section 1.2. However, based on extant work, it appears that different types of workloads are likely to offer different opportunities, as we discuss in more detail next.
Given the scale of resources available in Clouds, much attention has been paid to their use for running HPC jobs, both tightly coupled latency-sensitive codes (typically MPI) and loosely coupled embarrassingly parallel ones. Perhaps unsurprisingly, the former typically perform poorly when compared to running on on-premise systems with low latency networking, whilst the latter are generally considered suitable for running on Clouds. However, a significant number of studies, primarily due to the time at which they were conducted, do not consider newer instance types designed to alleviate latency issues. In particular we note that the Magellan Report (Yelick, et al., 2011) compares m1.large instances on EC2 to dedicated on-premise HPC clusters. Arguably, instance types aimed at MPI codes cater for a specialist and somewhat niche need; indeed, Iosup et al. (2007) report that 97% of typical scientific clusters are used for independent but not unrelated batch jobs. It is not clear how the broker, whose concern is variation at the same price, can add value to current low latency offerings, and so we do not consider them further.
Similarly, whilst variation in I/O performance is reported, it may be too unpredictable for a broker to address (Leitner and Cito, 2016). Further, we note that providers are beginning to offer instances with different levels of I/O performance at different prices. As a consequence, there appear to be few, if any, opportunities for a broker to offer added value with regards to I/O performance. However, based on extant work, there do appear to be opportunities for brokering the performance of CPU bound workloads.
Leitner and Cito (2016), O’Loughlin and Gillam (2013), Cerotti et al. (2012), Farley et al. (2012) and Lenk et al. (2011) report performance variation across instances of the same type with respect to a wide range of different CPU bound workloads. A common cause of this is heterogeneity, that is, instances of the same type differ in performance because they are backed by different CPU models. Arguably, given the discussion in section 3.4, this is to be expected, and we recall the Popek and Goldberg efficiency, or performance, property of hypervisors: a statistically significant fraction of a virtual machine's instructions must execute without VMM intervention, i.e. directly on the underlying CPU. CPU bound workloads predominantly execute arithmetic and logical instructions which do not require hypervisor intervention.
Notably, performance variation due to heterogeneity has a workload-specific characteristic: whilst different CPU models (backing the same instance type) are better or worse than each other, it is not possible to identify a best or worst CPU model for all workloads. Interestingly, Phillips et al. (2011) refer to this as an ‘anomaly’. Curiously, Leitner and Cito (2016) fail to note that variation is workload specific in their survey work, and further, Zhuang et al. (2013) assume there is a ‘best’ performing CPU and conclude that users can game the system, which will lead to resource starvation for this model. Clearly, this result needs re-appraising. An interesting consequence of performance variation due to heterogeneity is reported by Tordsson et al. (2011) and Schad et al. (2010), namely that there are Region and Availability Zone (AZ) differences in performance on EC2 for the same instance types due to variations in the hardware. As a consequence, for the same price, different locations offer better or worse performance.
From the perspective of Cloud users, performance becomes probabilistic, in the sense that it is typically not possible to know with certainty the CPU model that will be obtained when requesting a new instance. Further, we find no extant work addressing the question of dependence, that is, when satisfying a request for multiple instances, a common occurrence for providers, are the instances assigned to hosts in an independent or dependent manner?
Heterogeneity offers a clear brokering opportunity, as follows: assume that instances with the same CPU model have identical, constant performance. The broker then ranks CPU models by their performance, and sells instances based on knowledge of their CPU model alone, with better performing CPU models attracting a higher price than poorer ones.
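To make the idea concrete, the following sketch prices instances purely from a ranking of CPU models. It is illustrative only: the model names, benchmark scores, base price and pricing rule are assumptions, not values drawn from any provider or from our experiments.

```python
# Illustrative sketch: CPU model names, scores and prices are assumed values.
base_price = 0.10  # assumed on-demand price per hour for the instance type

# Broker's ranking of CPU models by a representative benchmark score
# (higher score = better performance for the chosen workload).
cpu_benchmark = {"E5-2650": 100.0, "E5-2651": 93.0, "E5645": 85.0}

def broker_price(cpu_model: str) -> float:
    """Price an instance purely from its CPU model: scale the base price
    by the model's performance relative to the worst-ranked model."""
    worst = min(cpu_benchmark.values())
    return base_price * (cpu_benchmark[cpu_model] / worst)

for model in sorted(cpu_benchmark, key=cpu_benchmark.get, reverse=True):
    print(f"{model}: {broker_price(model):.3f} per hour")
```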
However, instances with the same CPU model also display variation, although the degree of variation in homogeneous instances varies by workload. In particular, we note that Ou et al. (2013) make use of benchmarks which fit into low-level caches and so do not stress the memory hierarchy when executing; as a result they report that variation across instances with the same CPU model is negligible, whilst Farley et al. (2012) report a coefficient of variation (the ratio of standard deviation to mean) of up to 15%. Indeed, looking ahead to section 5.9, we show how it is possible to severely degrade the performance of an instance. Notably, extant work reports that variation across instances with the same CPU model is typically much smaller than the overall variation.
Whilst differing workloads have their own specific characteristics, the following appears to be true: once an instance has been obtained, its performance measured and its CPU model identified, we can estimate a future performance range with a degree of confidence. Further, this estimated range will be significantly narrower than the possible range for the instance before it was acquired. This is crucial to our proposed performance broker.
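A minimal sketch of this reasoning, using invented rather than measured runtimes: once a handful of runs have been observed on an acquired instance, a simple mean plus or minus two standard deviations band gives a predicted range that is far narrower than the range seen across the instance type as a whole before acquisition.

```python
import statistics

# Hypothetical runtimes (seconds) measured on one instance after acquisition;
# the values are illustrative, not experimental data.
measured = [41.2, 40.8, 42.0, 41.5, 41.1, 40.9, 41.7, 41.3]

mean = statistics.mean(measured)
stdev = statistics.stdev(measured)

# A simple predicted range for future runs on this instance, assuming its
# behaviour remains stable: mean +/- 2 standard deviations.
low, high = mean - 2 * stdev, mean + 2 * stdev
print(f"predicted range: {low:.1f}s to {high:.1f}s")

# Contrast with the much wider range seen across all instances of the type
# before acquisition (again an assumed figure).
pre_acquisition_range = (35.0, 55.0)
print(f"pre-acquisition range: {pre_acquisition_range}")
```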
In extant work, the typical methodology employed when measuring performance is cross-sectional: a set of instances is obtained, measured and then released, and this may be repeated at a number of intervals. Leitner and Cito (2016) report results of a longitudinal study of 15 instances over a 72-hour period and find them to have negligible variation over that time. However, an assumption that all instances with the same CPU model have the same performance properties over time seems inconsistent with the noisy neighbour effect. We conclude that the longitudinal performance properties of instances have not been fully characterised.
As discussed in section 1.4, we intend to investigate the performance broker through the use of simulation, requiring, amongst other things, a model of instance performance. However, extant results are typically reported either in terms of per-CPU mean and standard deviation, or as speed-ups with respect to mean performance. The characteristics of the distribution as a whole are not discussed or reported on. As a result, we find that whilst extant work demonstrates variation, identifies causes and reports summary statistics, this is arguably insufficient to allow a realistic model of performance to be derived. This is crucial, as without such a model we cannot investigate broker profitability with accuracy. Further, we can find no examples of performance data sets which we can download and use for this purpose.
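The following sketch illustrates why mean and standard deviation alone are insufficient for our purposes: two hypothetical performance populations, one unimodal and one bimodal, share near-identical summary statistics yet imply very different probabilities of obtaining a fast instance, which is precisely what a broker's profitability depends on. The numbers are synthetic and for illustration only.

```python
import random
import statistics

random.seed(1)

# Two hypothetical populations of instance performance scores with (almost)
# identical mean and standard deviation but very different shapes.
unimodal = [random.gauss(100, 10) for _ in range(10000)]
bimodal = [random.gauss(90, 2) if random.random() < 0.5 else random.gauss(110, 2)
           for _ in range(10000)]

for name, data in [("unimodal", unimodal), ("bimodal", bimodal)]:
    print(name, round(statistics.mean(data), 1), round(statistics.stdev(data), 1))

# Both report roughly mean=100 and stdev=10, yet the probability of drawing an
# instance scoring above 105 -- what matters to a broker selling assured
# performance -- differs markedly between the two.
for name, data in [("unimodal", unimodal), ("bimodal", bimodal)]:
    print(name, "P(score > 105) =", sum(d > 105 for d in data) / len(data))
```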
We will conduct a range of performance experiments which will allow us to (1) confirm existing conclusions and (2) generate a data set which we can use to provide a further characterisation of performance, which in turn will inform a model of instance performance. Based on the discussion in this section we propose the following hypotheses for our performance experiments:
H1: Performance variation is commonplace across different instance types and providers, and is found on both homogeneous and heterogeneous instance types. Heterogeneity of instance types leads to variation, and increased heterogeneity typically leads to increased variation.
H2: Heterogeneity typically produces consistent ranges of variation.
H3: Different workloads have different levels of variation due to differences in how they utilise underlying hardware.
Establishing hypotheses H1-H3 will confirm extant results. Hypotheses H4-H6 below are perhaps somewhat more speculative; however, they are still grounded in extant observations. In particular, H4 is based on observations and discussions of virtual machine scheduling, whilst H5 and H6 are based on the noisy neighbour effect.
H4: The allocation of CPU models to instances made within the same request is more irregular than extant assumptions allow for.
H5: Instances running on supposedly identical hardware, as identified by having the same CPU model, do not necessarily have the same performance levels over a given period of time.
H6: The actions of an instance can degrade the performance of its co-locating neighbour, the effect of which varies by workload.
The investigation of these hypotheses will form the subject of chapter 5.
4.7.2 Performance Improvement Strategies
In section 4.2 we reviewed extant work on performance improvement strategies. Ou et al. (2013) and Farley et al. (2012) propose strategies that exploit elasticity in order to improve performance, with the former focusing on batch jobs whilst the approach of the latter is more suitable for service oriented workloads. Using these strategies one can estimate the effective cost per unit of billable time for a particular level of performance, which serves as a pricing guide for a performance broker: it allows users to compare the known prices a broker can quote against the estimated cost of instance seeking.
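As a rough illustration of this calculation, the sketch below estimates the effective cost per useful hour under a simple seeking strategy in which instances not backed by the desired CPU model are discarded after one billed hour. The price, the probability of obtaining the desired model and the hours of work required are all assumed figures, not values taken from the cited studies.

```python
# Minimal sketch of instance seeking cost, under assumed parameters.
price_per_hour = 0.10   # assumed on-demand price per billed hour
p_good = 0.4            # assumed probability of obtaining the desired CPU model
hours_needed = 100      # useful compute hours required on a 'good' instance

# On average 1/p_good attempts are needed to obtain a good instance (geometric
# distribution); the failed attempts each waste one billed hour.
wasted_hours = (1.0 / p_good) - 1.0
total_cost = price_per_hour * (hours_needed + wasted_hours)
effective_cost_per_hour = total_cost / hours_needed

print(f"effective cost per useful hour: {effective_cost_per_hour:.4f}")
# A broker quoting a known price below this effective cost, for the same
# performance level, would be attractive to the user.
```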
Further, and notably for our work, both Ou et al. (2013) and Farley et al. (2012) present a model of instance performance. In the former case the model simply assumes that the performance of instances with the same CPU model is constant, so there are no differences between instances with the same CPU model and no temporal variation in performance. The latter assumes variation exists; however, all instances with the same CPU model are assumed to follow the same normal distribution. As a consequence, these instances have the same mean performance level over time with the same degree of variation around it. However, this seems inconsistent with noisy neighbour effects, and there is no reason to expect that instances will have the same mean and variation in their respective performance over time simply because they share a CPU model.
Finally, both Ou et al. (2013) and Farley et al. (2012) make a number of implicit assumptions, notably independence in the allocation of CPU models to instances. This assumption would not hold, for example, when a provider implements a packing policy, as an instance will then be scheduled onto the same host as the previous one where possible. In numerical simulations we find that dependent allocation leads to the same expected cost for instance seeking as the independent case, but with greater variation, i.e. greater risk.
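The following Monte Carlo sketch illustrates the point under stated assumptions: the probability of obtaining the desired CPU model and the strength of the packing effect are invented parameters, and the packing policy is crudely modelled by having each instance reuse the previous instance's CPU model with some probability. The mean wasted cost is the same in both cases, but its spread is noticeably larger when allocation is dependent.

```python
import random
import statistics

random.seed(2)
P_GOOD = 0.4        # assumed fraction of hosts backed by the desired CPU model
BATCH = 10          # instances requested per run
RUNS = 20000

def seek_cost(correlated: bool) -> int:
    """Billed hours wasted discarding instances not on the desired CPU model.
    If correlated, each instance reuses the previous host's CPU model with
    probability 0.8 (a crude stand-in for a packing policy)."""
    wasted, prev_good = 0, None
    for _ in range(BATCH):
        if correlated and prev_good is not None and random.random() < 0.8:
            good = prev_good
        else:
            good = random.random() < P_GOOD
        prev_good = good
        if not good:
            wasted += 1
    return wasted

for label, corr in [("independent", False), ("packed", True)]:
    costs = [seek_cost(corr) for _ in range(RUNS)]
    print(label, round(statistics.mean(costs), 2), round(statistics.stdev(costs), 2))
```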
There is a clear need for a performance model of instances that reflects empirical findings, and looking ahead, this will be the subject of section 6.4.
4.7.3 Brokers
In section 4.3 we reviewed extant work on brokers, with a particular focus on performance. Gottschlich et al. (2014), Li et al. (2010) and Lenk et al. (2011) consider brokers offering performance comparison services, and so the focus is on performance differences at different prices across different providers. However, we typically find no consideration that performance may differ at the same price on the same provider, with the exception of Lenk et al. (2011), who note that a broker may be able to expose differences in performance due to heterogeneity. Similarly, Pawluk et al. (2012) propose a broker that acquires resources based on stated requirements, one of which is performance. However, we again find that the treatment of performance is insufficiently developed and does not fully reflect empirical findings. Interestingly, they suggest that their broker may employ Just in Time Benchmarking (JITB), and we note that this implies it will become, in effect, an instance seeker on behalf of its clients. As such, our finding that seeking risk may be greater than is typically assumed is of relevance to this work.
However, we can find no example of a broker that specifically addresses the problem of variation at the same price or indeed one that operates by providing performance-assured instances.
In addition to a demonstration of profitability, the approach taken by Rogers and Cliff (2012) is of interest to us as we will also need to consider how a pool of instances is managed over time. They make use of a discrete event simulation (DES), in which sets of events take place at discrete points in time and, once these are completed, the clock moves forward. After multiple runs an estimate of expected profit is calculated. Looking ahead to section 6.10, we employ the same methodology.
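A minimal sketch of this methodology, with hypothetical events and cash flows (the hourly cost, sale price and sale probability are assumptions, not values from Rogers and Cliff): events are held in a time-ordered queue, the clock advances from event to event, and expected profit is estimated by averaging over many runs.

```python
import heapq
import random
import statistics

def run_once(seed: int) -> float:
    """One discrete-event run: events are (time, cashflow) pairs held in a
    priority queue; the clock jumps to each event time in order."""
    random.seed(seed)
    profit = 0.0
    events = []  # min-heap ordered by event time
    # Schedule hypothetical events: hourly instance costs and possible sales.
    for hour in range(1, 25):
        heapq.heappush(events, (float(hour), -0.10))      # cost of holding an instance
        if random.random() < 0.6:                          # assumed chance of a sale
            heapq.heappush(events, (hour + 0.5, 0.15))     # revenue from a sale
    while events:
        clock, cashflow = heapq.heappop(events)            # advance clock to next event
        profit += cashflow
    return profit

profits = [run_once(seed) for seed in range(1000)]
print("estimated expected profit:", round(statistics.mean(profits), 3))
```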
4.7.4 Markets and Pricing
In sections 4.4 and 4.5 we reviewed markets and pricing. The commodity view of Cloud is well established: Buyya et al. (2009), Cartlidge (2014), Weinman (2015) and Garg et al. (2013). Commodities are typically exchange traded, which has the benefit of continuous price discovery and liquidity, and most exchanges operate a continuous double auction (CDA), allowing asks and bids to be posted continuously onto a central order book for clearing. A commonly used model of agents operating within a marketplace is the Gode and Sunder (1993) zero-intelligence constrained (ZI-C) agent, which under certain circumstances leads to a CDA finding the equilibrium price. In a refinement of this, Cliff and Bruten (1997) introduce profit-driven intelligent ZIP agents and demonstrate how a CDA market populated by ZIP agents rapidly finds the equilibrium price. An over the counter (OTC) market provides for bespoke requirements that cannot be met on a commodity market, as the latter sells standardised goods only. However, this comes at the expense of price transparency and potentially liquidity. A feature of OTC markets is negotiation between brokers and clients, which is not found on exchanges.
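As a crude illustration of the ZI-C idea, the sketch below has constrained zero-intelligence traders shout random bids and asks into a much simplified CDA; the limit prices, session length and midpoint clearing rule are invented for illustration, and the sketch is not a faithful reproduction of the Gode and Sunder experiments. Nonetheless, transaction prices tend to cluster in the region where the assumed supply and demand schedules overlap.

```python
import random

random.seed(3)

# Limit prices: buyers will not bid above their limit, sellers will not ask
# below theirs (the 'constrained' part of ZI-C). Values are illustrative.
buyer_limits = [100, 95, 90, 85, 80, 75, 70, 65]
seller_limits = [60, 65, 70, 75, 80, 85, 90, 95]

def zic_session(rounds: int = 2000):
    """Randomly chosen traders shout random prices within their constraint;
    a trade clears whenever the best bid meets or crosses the best ask."""
    trades, best_bid, best_ask = [], None, None
    for _ in range(rounds):
        if random.random() < 0.5:
            limit = random.choice(buyer_limits)
            bid = random.uniform(1, limit)
            if best_bid is None or bid > best_bid:
                best_bid = bid
        else:
            limit = random.choice(seller_limits)
            ask = random.uniform(limit, 200)
            if best_ask is None or ask < best_ask:
                best_ask = ask
        if best_bid is not None and best_ask is not None and best_bid >= best_ask:
            trades.append((best_bid + best_ask) / 2)   # clear at the midpoint
            best_bid, best_ask = None, None            # reset the order book
    return trades

prices = zic_session()
print("mean transaction price:", round(sum(prices) / len(prices), 1))
```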
4.7.5 Cloud Simulators
In section 4.6 we reviewed Cloud simulators, noting limitations with regard to performance that make CloudSim unsuitable for our specific problem without extension. Indeed, other CloudSim limitations led to the development of CReST; however, for our specific problem we would also need to extend its base classes, and in doing so accept the CReST view of the Cloud. We therefore choose to write our own simulator, and we discuss this choice further in chapter 8.
Arguably the most important aspect of the model of a broker is the performance of instances. In order to ensure the validity of data assumptions it would be prudent to verify extant findings on reported variation. Further, performance experiments will allow for a fuller analysis, and hence characterisation, than would be possible if we simply relied upon extant results alone. In particular, we note that results are predominantly reported in terms of mean and standard deviation only. In addition, we can also address the issue of the temporal performance of instances with the same CPU model. In the next chapter we discuss and present the results of our empirical work.