A review of Cloud performance in chapter 4 revealed that variation in performance amongst instances of the same type is commonplace, and is reported for a range of workloads, from latency-sensitive MPI codes and I/O bound workloads to CPU bound workloads. Further, the problem is not limited to particular instance types or providers. Variation presents an opportunity for a broker. However, as discussed in section 4.7, the brokering of CPU bound workloads appears most readily achievable, as the variation is in large part due to differences in underlying hardware. From the results presented in sections 5.2 to 5.4 we observe that brokering of memory bandwidth performance is also possible, for the same reason.
However, extant results on performance in the literature were insufficient as a basis for developing a realistic model of performance, as they were typically limited to reporting means and standard deviations, or speed-ups relative to various means. Further, performance models employed in extant performance improvement strategies contain a number of assumptions that are not supported by empirical evidence, as shown in section 5.5. As such, we were required to develop a model of instance performance, which necessitated extensive benchmarking of Public Clouds, the subject of chapter 5. In section 5.10, we determined a number of features that the model should include, based on the results of the benchmarking.
However, modelling the profitability of the broker requires more than instance performance. We must also, at a minimum, consider the marketplace where the broker acquires instances, the sellers on the marketplace, and the clients of the broker. The rest of this chapter is organized as follows. In section 6.1 we provide a high-level overview of the broker and the marketplace within which it operates. In section 6.2 we construct a commodity exchange for Cloud resources through which multiple providers can sell appropriately equivalent instances, and in section 6.3 we discuss the sellers on the exchange, with particular focus on the heterogeneity of their infrastructure and how they allocate instances to hardware with different CPU models.
In section 6.4 we model the performance of instances, taking into account the discussion of model features in section 5.10. Based on this, in section 6.5 we describe how the broker divides the observed performance range into so-called tranches, with each one specifying minimum performance levels. We also describe how the broker randomly generates prices between the price of the instance at the exchange and a higher price determined by its performance speedup. In this way the broker is acting as a Gode and Sunder (1993) ZI-C trading agent.
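The pricing rule described above can be illustrated with a minimal sketch. It assumes that the upper bound is the exchange price scaled linearly by the measured speedup, and that quotes are drawn uniformly between the two bounds; neither assumption is fixed by the text, and the function name `zic_quote` and its parameters are hypothetical.

```python
import random


def zic_quote(exchange_price: float, speedup: float, rng: random.Random) -> float:
    """Return a random quote between the exchange price and a speedup-scaled ceiling."""
    if exchange_price <= 0 or speedup < 1.0:
        raise ValueError("expected a positive price and a speedup of at least 1.0")
    # Assumption: the upper bound scales linearly with the measured speedup.
    ceiling = exchange_price * speedup
    # Assumption: quotes are uniform; the lower bound is the cost constraint,
    # so the broker never quotes below what it pays at the exchange.
    return rng.uniform(exchange_price, ceiling)


rng = random.Random(42)  # seeded only to make the sketch repeatable
quote = zic_quote(exchange_price=0.10, speedup=1.25, rng=rng)
```

The lower bound plays the role of the budget constraint in a ZI-C agent: the quote is random, but never loss-making with respect to the exchange price.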
In section 6.6 we use a workload trace released by Google (Reiss et al., 2011) to construct a population of users with performance needs. Clients of the broker express a performance requirement by requesting a particular tranche. We discuss how we determine a limit price, i.e. the price above which a quote from the broker will be rejected as being too high. Sections 6.7 through to 6.9 discuss the final technical pre-requisites for our simulation, i.e. our model in action. In section 6.10 we demonstrate that the broker can be profitable under certain conditions relating to the variation found, minimum lease times on the exchange and the quantity of demand.
Finally, in order to motivate discussion throughout this chapter we introduce a hypothetical client of the broker: FinRisk, a financial services company that each night runs simulations to estimate risk in its trading positions. To ensure simulations finish on time the client requires instances of a minimum performance and uses the broker's services to acquire them, saving it the time and effort of acquiring them itself. We begin with a high-level discussion of the broker and in particular some of the choices we make in its operation.
The broker operates by maintaining a pool of instances rented from a compute marketplace in expectation of some level of demand. The broker measures the performance of instances in the pool, re-prices them, and offers them for sub-let to its clients. At any given moment in time some of these instances are available for sub-letting whilst others are being sub-let. At a minimum, the operational cost of maintaining the pool will include the rental charges for the instances. In practice, the broker will incur additional costs of business, which may potentially include transaction fees at the exchange.
In order to cater for a wide range of needs, performance is measured with respect to a selection of benchmarks, and these are publicly available, meaning all clients may express a performance requirement with respect to them. We refer to these as public benchmarks. FinRisk has determined that the performance of its simulations correlates well with benchmark workload 1, and so when ordering instances requires that performance has been measured with respect to that. In addition, we envisage clients registering benchmarks for their private use, but for simplicity do not include them in the model. For example, FinRisk could provide a kernel of its simulation for such purposes. We set the number of public benchmarks to 31, a number chosen to match the size of the SPEC CPU 2006 suite.
For each instance in the pool the broker maintains a performance history with respect to each public benchmark by periodically re-measuring its performance. Aggregating measurements across all instances in the pool allows the broker to maintain a ‘global’ performance history per benchmark. For each benchmark its global performance history is divided into so-called tranches, A, B, C…and so on, where each tranche specifies a minimum performance level. Whenever a client requires instances it places an order with the broker specifying: (num_instances, workload, tranche), where num_instances is the number of instances and tranche is a minimum performance level with respect to workload. For example, as noted, FinRisk requires instances measured with respect to workload 1, and so places orders such as (10, 1, A), meaning 10 instances whose performance is tranche A when measured with respect to workload 1. If the broker has sufficient instances available with the required performance then these are allocated to the client.
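The order flow just described can be sketched as follows. This is an illustration under stated assumptions, not the model of section 6.5: it assumes higher benchmark scores are better, and that tranche minimums are quantile cut points of the global performance history. The names `Instance`, `tranche_floors` and `fulfil`, and all scores, are hypothetical.

```python
from dataclasses import dataclass
from statistics import quantiles


@dataclass
class Instance:
    instance_id: str
    scores: dict          # workload id -> latest measured score (higher is better)
    leased: bool = False


def tranche_floors(history, labels=("A", "B", "C")):
    """Map each tranche label to a minimum score, best tranche first."""
    # Assumption: tranche boundaries are quantiles of the global history.
    cuts = quantiles(history, n=len(labels))      # len(labels) - 1 cut points
    floors = [min(history)] + cuts                # ascending minimum levels
    return dict(zip(labels, reversed(floors)))


def fulfil(pool, order, floors_by_workload):
    """Allocate available instances for (num_instances, workload, tranche), or None."""
    num_instances, workload, tranche = order
    floor = floors_by_workload[workload][tranche]
    candidates = [inst for inst in pool
                  if not inst.leased and inst.scores[workload] >= floor]
    if len(candidates) < num_instances:
        return None                               # the order cannot be met from the pool
    chosen = candidates[:num_instances]
    for inst in chosen:
        inst.leased = True
    return chosen


# Illustrative global history for workload 1 and a three-instance pool.
floors = {1: tranche_floors([90, 95, 100, 105, 110, 115, 120, 125, 130])}
pool = [Instance("i-1", {1: 128.0}), Instance("i-2", {1: 121.0}),
        Instance("i-3", {1: 104.0})]
allocated = fulfil(pool, (2, 1, "A"), floors)
```

An order is either met in full from instances already in the pool or not met at all; the sketch makes no attempt to rent further instances in response to an order.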
When receiving an order for (num_instances, workload, tranche) the broker responds by quoting a price, which is determined by the performance of the instances with respect to workload and the workload’s global performance history. As we are pricing with respect to different workloads, which have different performance characteristics, there is potential for clients to game the broker. For example, suppose a client knows that instances capable of delivering tranche A performance for workload one are also capable of delivering tranche A for workload two, and that, due to its characteristics, pricing for workload one is cheaper than for workload two. In this case the client will order instances with respect to workload one for use with workload two. We discuss gaming in more detail in section 6.10.
Clients accept or reject quotes by comparing the price offered by the broker for a particular performance tranche with the expected effective cost they would pay should they instead choose to instance seek for that performance level. This serves to put a realistic limit on what clients would be prepared to pay for given performance levels. For example, FinRisk may have determined that through performance improvement strategies, such as those discussed in section 4.2, they can acquire instances of sufficient performance with an effective cost of +10% for their intended duration. As such, an attractive price for FinRisk is anything below this.
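The accept or reject decision can be written down directly. The sketch below assumes that the "+10%" effective cost is expressed as a premium over the base price of the instance at the exchange; the function names and prices are hypothetical.

```python
def limit_price(base_price: float, effective_premium: float) -> float:
    """Price the client expects to pay, in effect, by instance seeking itself."""
    # Assumption: the effective cost is a premium over the base price,
    # e.g. 0.10 for the "+10%" in the FinRisk example.
    return base_price * (1.0 + effective_premium)


def accepts(quote: float, base_price: float, effective_premium: float) -> bool:
    """A quote is attractive only if it is strictly below the limit price."""
    return quote < limit_price(base_price, effective_premium)


# FinRisk-style example with an illustrative base price of 0.10 per hour.
cheap_quote_ok = accepts(quote=0.105, base_price=0.10, effective_premium=0.10)
dear_quote_ok = accepts(quote=0.120, base_price=0.10, effective_premium=0.10)
```

Here the first quote is accepted and the second rejected, since only the first falls below the client's expected effective cost.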
The broker provides the client with exclusive access to instances they are sub-letting, and we say that the client has leased an instance from the broker. As there are no usage limits clients such as FinRisk can terminate the lease on their instances at any point. When a client terminates a lease the broker is free to sub-let the instance to other clients and over its lifetime an instance may be sub-let to one or more clients. In order for an instance to be profitable, the revenue the broker generates from the various sub-lets must exceed the charge incurred from renting the instance at the marketplace. In practice, in order to be profitable overall the revenue generated from the pool must cover all operational costs incurred.
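The profitability conditions above amount to simple bookkeeping, sketched below. The representation of an instance as a (rental charge, list of sub-let revenues) pair, the function names and all figures are illustrative assumptions.

```python
def instance_profit(rental_charge: float, sublet_revenues: list) -> float:
    """Revenue from all sub-lets of one instance minus its rental charge."""
    return sum(sublet_revenues) - rental_charge


def pool_profit(instances: list, other_costs: float = 0.0) -> float:
    """Profit over the whole pool, net of any additional operational costs."""
    return sum(instance_profit(charge, revenues)
               for charge, revenues in instances) - other_costs


# Three instances rented at 1.00 each: sub-let twice, once, and never.
pool = [(1.00, [0.60, 0.70]), (1.00, [0.40]), (1.00, [])]
per_instance = [instance_profit(charge, revenues) for charge, revenues in pool]
overall = pool_profit(pool, other_costs=0.05)
```

In this example only the first instance is profitable in isolation, and the pool as a whole makes a loss: idle instances held in anticipation of demand are a cost that the revenue from sub-let instances must cover.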
On-demand provisioning is arguably the unique and compelling Cloud characteristic, and so we choose to implement an on-demand performance service. The broker provides a value-add on the standard on-demand market. On-demand means that clients can order instances without prior notification or usage commitments, although minimum charges apply, and can terminate the lease at any point. By contrast, a reservation based system, such as those based on the WZH model discussed in section 4.3, requires the submission of probabilities of future need, the payment of non-refundable deposits and specified lease durations. However, unlike typical reservation based systems, on-demand services do not guarantee to be able to fulfill all orders, with EC2, for example, only guaranteeing capacity for reserved instances. Similarly, our broker does not guarantee order fulfillment, but in order to be a useful service the broker will need to fulfill some proportion of orders. We consider this a Quality of Service (QoS) condition.
On-demand services by definition require immediate fulfillment. On a financial exchange, an order that requires immediate execution typically has a time to live (ttl) of a few seconds. What should immediate execution mean for the broker? Arguably, this should mean there is no time for the broker to rent more instances, which Mao and Humphrey (2012) report can take anywhere from 1 to 5 minutes, and then to measure their performance, which requires additional time. Further, there are no guarantees over the performance of additional instances and so no guarantees of how long it will take to fulfill a request. For our purposes, immediate execution means that the broker can only satisfy requests from instances available in the pool.
Our requirement that the broker operates by satisfying requests out of the pool and does not require usage commitments means that the broker acquires on-demand instances only. There is of course potential for being more profitable by using instances from the spot market, as they are priced lower but offer the same performance level. However, this introduces a number of complications for the broker. Firstly, in a spot market there is no guarantee of availability, resulting in issues for the broker with regard to acquiring instances in anticipation of demand. In particular, the broker may not be able to acquire instances either quickly enough or in sufficient quantity to meet demand, resulting in missed opportunities to profit as well as reputational issues over service reliability. The spot market also introduces the further issue of instance re-claim, in particular whether or not the broker would be liable for any potential losses its clients suffer when an instance is re-claimed. Whilst we do not consider spot instances further with regard to our broker, we do intend to investigate their possible use in further work.
The broker cannot provide guarantees over future performance; however, performance risk at the point of sale is considerably greater than that of an instance that has been obtained and whose performance has been measured. As such, the broker can provide strong assurances over future performance. There is still risk on both sides. The client may pay for an instance whose performance subsequently declines. However, as there are no usage commitments, the lease can be terminated and the instance returned to the broker, limiting the downside risk for the client. Further, if performance improves, the client benefits from this with no increase in price, whereas the broker cannot reclaim the instance and re-sell it at a higher price. Operating an on-demand performance service therefore means the broker assumes more risk than its clients.
For FinRisk, performance assurances reduce the risk that their simulations will not finish on time, or that only a reduced number will. In turn, this helps to ensure the risk in their trading positions can be estimated with the required degree of confidence, which is a valuable service.
Given the prevalence of the commodity view, and indeed the seemingly inevitable market pressures that lead to commoditisation, we assume there is a Cloud Exchange (CeX) through which multiple providers sell instances meeting a particular specification, and from which our broker rents instances. The broker is re-selling on a secondary market. Figure 10 below provides a high-level graphical representation of the broker operation.
Figure 10: High-level overview of the broker operation. A client submits a request to the broker, which then attempts to allocate appropriate instances. This is the secondary marketplace. In order to be able to fulfill on-demand requests the broker acquires instances from the CeX, the primary marketplace.