Due to the flexibility and elasticity of the Cloud, a diverse range of workloads is run on it. Indeed, Cloud customer case studies describe workloads ranging from web hosting to computationally intensive video transcoding, financial and bio-molecular simulations, and data-driven analytics. In this section we discuss how performance variation can impact typical Cloud workload costs. We broadly characterise Cloud workloads as being either service oriented or batch oriented, and we note that a Netflix Technology Blog (Spyker, 2017) characterises Netflix's workloads on EC2 in the same manner.
3.1.1 Impact on Service Oriented Workloads
A service oriented workload typically executes work on behalf of a set of clients, for example web searches, messaging or on-line gaming. In a stateless service, instances operate independently to complete work. The Cloud provider Heroku, which offers a platform for workload execution, recommends that applications should be developed in a stateless manner where possible (Wiggins, 2017), and Wagner and Sood (2016) suggest that statelessness is an aid to building resilient Cloud services. Stateless services are often referred to as horizontally scalable, as additional instances can be acquired on-demand to handle increases in service load. However, variation in performance potentially leads to variation in the number of instances required to complete the same amount of work per unit of time – and so costs vary. Indeed, marginal cost increases can be high.
To illustrate, consider the following example: Suppose there is a known load of 12000 Units of Work (UoW) that a service must deliver over the next hour, and the service is currently running at full capacity. Further, from past history it is known that instances of a particular type will consistently deliver 3000UoW per hour. In this case the scaling decision is straightforward – 4 instances are required to complete the work.
However, suppose instead that whilst the performance of individual instances is constant over time, differences in the performance of supposedly identical instances have been observed. Further, from past history we have observed a worst performance of 2800UoW/h and a best of 3200UoW/h. We assume that the performance of instances is uniform over this range, that is, the performance of an arbitrary instance is drawn from U[2800, 3200]. This is a 12.5% decrease in the amount of work per hour from the best performing instance to the worst, and looking ahead to Chapter 5 we empirically demonstrate that variation far in excess of this is commonplace. At best, 4 instances are needed to complete 12000UoW/h, whilst at worst 5 instances are needed, giving a 25% marginal increase from the lowest possible cost to the highest. Assuming that the performance of instances is independent of each other, the combined rate of 4 instances is symmetrically distributed about 12000UoW/h, so 4 instances suffice with probability 0.5 and a fifth is needed otherwise; the expected number of instances needed per hour to complete the work is therefore 4.5, and so the expected cost is 4.5 instance hours at the per hour rate.
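This expected value can be checked with a simple Monte Carlo simulation. The following Python sketch is purely illustrative (the function names and trial count are our own assumptions); it repeatedly draws per-instance rates from U[2800, 3200] and counts how many instances are needed to reach 12000UoW/h.

```python
import random

def instances_needed(target_uow, low=2800.0, high=3200.0):
    """Acquire instances one at a time, each with a rate drawn from
    U[low, high], until their combined rate meets the target."""
    total, count = 0.0, 0
    while total < target_uow:
        total += random.uniform(low, high)
        count += 1
    return count

def expected_instances(target_uow, trials=100_000):
    """Monte Carlo estimate of the expected number of instances needed."""
    return sum(instances_needed(target_uow) for _ in range(trials)) / trials

# For 12000UoW/h the estimate is close to 4.5: four instances suffice
# roughly half the time, and otherwise a fifth is required.
print(expected_instances(12_000))
```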
Variation presents an opportunity for a broker, as they can quote a fixed cost for instances, which a user can then compare to the expected cost should they attempt to find the instances themselves. It is a matter of common observation of human behaviour that, when faced with choices which involve a degree of risk, the expected pay-out of each choice is not the sole consideration when deciding between them (Neumann and Morgenstern, 2007). Indeed, the well-known St Petersburg Paradox presents an example of a game with an infinite expected pay-out, and so if expected pay-out were the sole consideration amongst choices, players would be prepared to pay any fixed finite amount to enter the game. However, as noted, from common behavioural observation this is not the case. Such considerations led Neumann and Morgenstern to develop utility theory, which ranks preferences over a set of choices with potentially unknown outcomes. They show that so long as the choices an individual makes are rational, i.e. conform to a specified set of axioms, then a utility function can be constructed in such a way that the individual's choices appear to be maximizing their expected utility and not the expected pay-out.
Utility leads to the notion of risk-aversion, which for our purposes we describe as follows: a risk-averse individual may prefer a fixed known loss to a smaller expected loss, and the more risk-averse the individual, the higher the risk premium they are prepared to accept. Indeed, the prevalence of insurance demonstrates a common willingness to engage in loss-making games. With regard to a broker, suppose it has 4 instances (of the same type) that can deliver the required amount of work per hour, i.e. 12000UoW/h. If the broker adds a 10% markup onto the provider price per instance, then we have a per hour cost of 4.4 instance hours, which is less than the expected cost of 4.5. In this case the broker is offering a fixed cost below the expected cost, which would be attractive to all risk-averse users and indeed some risk-takers. Even at a markup of 15%, where the known cost of 4.6 instance hours is slightly above the expected cost of 4.5, some risk-averse users will still prefer this higher known cost to the expected cost.
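The comparison between the broker's fixed cost and the user's expected cost is simple arithmetic; the hedged sketch below (the function and variable names are ours, chosen for illustration) reproduces the figures above and also computes the break-even markup at which the fixed cost equals the expected cost.

```python
def broker_fixed_cost(instances, markup):
    """Hourly cost, in instance hours at the provider's rate, of renting
    `instances` broker instances with a proportional markup applied."""
    return instances * (1 + markup)

expected_cost = 4.5   # expected instance hours if the user buys directly
broker_instances = 4  # the broker's instances jointly deliver 12000UoW/h

for markup in (0.10, 0.15):
    fixed = broker_fixed_cost(broker_instances, markup)
    print(f"{markup:.0%} markup: fixed cost {fixed:.1f} vs expected cost {expected_cost}")

# The break-even markup, above which the fixed cost exceeds the expected
# cost of buying directly, is 4.5 / 4 - 1 = 12.5%.
print(f"break-even markup: {expected_cost / broker_instances - 1:.1%}")
```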
The impact of variation will of course depend upon its degree, but also on the scale of usage. In the example above, suppose the user requires 120,000UoW/h, a ten-fold increase. With a variation of U[2800, 3200] across different instances the minimum number of instances required is 38 and the maximum is 43, whilst the expected number of instances required is 40.5. However, the probability that fewer than 40 instances will suffice is almost zero; similarly, the probability of needing more than 41 is also almost zero. Hence, with probability close to 1 the cost of completing the work is either 40 or 41 instance hours. If the broker has 40 instances that can complete the work, then at a markup of 1% the broker can offer less than the expected cost of 40.5, whilst a 2% markup will still appeal to some risk-averse users. Obtaining a higher margin requires the broker to have either 38 or 39 suitable instances, and we note again that the likelihood of obtaining these at random is small. Conversely, if the required amount of work is 3000UoW then at best 1 instance is sufficient and at worst 2 are needed, with an expected number of 1.5, offering high margins to a broker with 1 suitable instance whilst still being less than the expected cost.
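The claim that these tail probabilities are almost zero can be checked with a normal approximation to the sum of the instance rates. The sketch below uses this approximation purely for illustration (it is adequate here because the events of interest lie far into the tails):

```python
import math

def p_rate_sum_at_least(n, target, low=2800.0, high=3200.0):
    """Normal (CLT) approximation to P(sum of n rates >= target),
    where each rate is drawn independently from U[low, high]."""
    mean = n * (low + high) / 2
    std = math.sqrt(n) * (high - low) / math.sqrt(12)
    z = (target - mean) / std
    return 0.5 * math.erfc(z / math.sqrt(2))  # upper tail of the standard normal

target = 120_000
print(p_rate_sum_at_least(39, target))      # chance that 39 instances suffice: ~1e-5
print(1 - p_rate_sum_at_least(41, target))  # chance that even 41 fall short: ~1e-5
```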
3.1.2 Impact on Batch Jobs
The term batch job refers to a workload that can be run without user interaction, and batch processing refers to the execution of one or more batch jobs. Typically a batch job will progress at the fastest rate allowed by the resources, and knowing the rate of progression allows for an estimation of the execution time, i.e. the time taken to complete the job. Clouds are popular for executing batch jobs as instances can be obtained on-demand and returned to the provider when the job has completed, with no further costs incurred. This makes workload execution costs transparent, and allows users to choose between providers based on execution costs and time.
Whilst the Cloud is certainly suitable for running single batch jobs, it is likely that they are run in groups. Indeed, Iosup (2007) found that 96% of the total workload in large-scale scientific compute clusters comprises various ‘groupings’ of batch jobs, where individual jobs within a group typically execute independently of each other; however, they are not typically unrelated, as the output of each job is aggregated in some manner. Stochastic simulations, which are commonplace in scientific computing and the financial services industry, are an example of a ‘grouping’ of independent but related batch jobs. Multiple batch jobs execute the same code independently but with inputs drawn from statistical distributions, and hence each execution of the job results in a different output. Once completed, the outputs of the jobs are aggregated and typically used as an estimate of the expected value of some statistic of interest.
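As a purely hypothetical illustration of this pattern, the sketch below runs many independent ‘jobs’ that execute the same code with inputs drawn from a distribution and aggregates their outputs into an estimate; in practice each job would run on its own instance.

```python
import random
import statistics

def simulation_job(seed):
    """One batch job: the same code run with an input drawn from a
    statistical distribution, so each run yields a different output."""
    rng = random.Random(seed)
    shock = rng.gauss(0.0, 0.2)   # hypothetical random input
    return max(0.0, 1.0 + shock)  # hypothetical model output

# Each job would normally run on a separate instance; once all have
# completed, the outputs are aggregated into the estimate of interest.
outputs = [simulation_job(seed) for seed in range(10_000)]
print("estimated expected value:", statistics.fmean(outputs))
```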
However, performance variation will lead to differences in the execution times of the batch jobs on different instances. A potential consequence is that not all jobs will finish within budget, and leaving all jobs to run to completion exposes users to unpredictable cost over-runs. There is also potential for arguably more serious issues through missed deadlines. Typically, simulations which estimate risk by calculating quantities such as the Value at Risk (VaR) or Expected Shortfall (ES) need to be completed in time for the next day’s trading. Performance variation may lead to some runs not completing on time, reducing the accuracy of the simulation, with potential for financial loss due to underestimation of risk. Mitigating this may require speculative execution of additional instances, to ensure that with some specified probability the required number of instances will complete the workload by the deadline.
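One simple way to size such speculative execution, assuming for illustration that each instance independently completes its job before the deadline with some probability p (this model and the figures below are our assumptions, not properties of any particular workload), is to launch the smallest number of instances n such that at least k complete on time with the desired confidence:

```python
from math import comb

def p_at_least_k(n, k, p):
    """P(at least k of n independent instances finish before the deadline),
    where each finishes on time with probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def instances_to_launch(k, p, confidence=0.99):
    """Smallest n such that at least k instances finish on time with the
    required confidence."""
    n = k
    while p_at_least_k(n, k, p) < confidence:
        n += 1
    return n

# e.g. needing 100 completed runs, with each instance finishing on time
# with probability 0.9, requires launching several extra instances.
print(instances_to_launch(k=100, p=0.9))
```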
Finally, batch processing of data using large clusters of instances is commonplace on the Cloud, with Hadoop being particularly popular. Hadoop is an implementation of the MapReduce framework and schedules tasks for execution across instances in such a way that processing takes place on the hosts where the data resides. However, differences in the performance of machines in a Hadoop cluster can adversely impact overall performance, that is, overall execution time. Zaharia et al. (2008) note that running Hadoop in heterogeneous Cloud environments, or even in homogeneous ones with resource contention on the host, can cause severe performance degradation. When running Hadoop on the Cloud, there is potential for a small number of slower instances to slow the progression of the job as a whole, potentially requiring all instances to be run for longer. As such, the cost impact of the performance variation of a few instances becomes disproportionately high.
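The straggler effect can be seen in a toy model in which the job completes only when its slowest instance finishes, so a single slow instance sets the overall execution time and, with it, the billed cost of the entire cluster (the figures below are illustrative assumptions, not measurements):

```python
def cluster_cost(per_instance_hours):
    """A crude model: the job finishes only when the slowest instance does,
    and every instance is billed for that full duration."""
    runtime = max(per_instance_hours)
    return runtime, runtime * len(per_instance_hours)

n = 50
uniform_cluster = [1.0] * n              # hypothetical: 1 hour per instance
one_straggler = [1.0] * (n - 1) + [1.5]  # a single instance 50% slower

for label, times in (("uniform cluster", uniform_cluster), ("one straggler", one_straggler)):
    runtime, cost = cluster_cost(times)
    print(f"{label}: runtime {runtime:.1f}h, cost {cost:.0f} instance hours")
```

Under this model a single instance running 50% slower raises the cost of the whole 50-instance job from 50 to 75 instance hours, illustrating the disproportionate impact of a few slow instances.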