A Performance Brokerage for Heterogeneous Clouds


A4: Risk Diversification and Instance Seeking


In section 4.9 we demonstrate that the degree of performance interference, or mutual degradation, experienced by co-locating instances depends upon the workloads being run. For a primary instance running pbzip2, running pbzip2 in all of its neighbours increased execution times by 20%. However, when the neighbours ran sa-learn instead, the increase was 7%, whilst in an extreme case running two STREAM triad processes in all neighbours resulted in a 450% increase in execution time. That ‘mutual interference’ varies by workload is in agreement with extant results: Delimitrou and Kozyrakis (2017) show that different workloads co-locating on the same host interfere with each other to different degrees.


That the ‘noisy neighbour’ is workload dependent is notable for a broker offering a workload specific performance service. The broker operates a pool of instances which are allocated to clients on the basis of their workload specific performance level. The broker wishes to avoid allocating an instance to one client which then adversely impacts the performance of instances allocated to other clients. This is particularly problematic if the latter instances are covered by an SLA as there is potential for degradation beyond the tranche level sold, resulting in an SLA violation and pay-out.
We find a useful analogy with so-called modern portfolio theory (MPT), which seeks to reduce risk, i.e. variation in portfolio returns, through diversification, so that portfolio asset returns are either independent or weakly correlated, whilst maintaining the level of expected return. A diversified portfolio is one whose asset returns do not all move in the same direction in response to the same stimulus. One potential solution to the noisy neighbour problem is for the broker to implement a no-neighbour policy: the broker terminates any new instances added to the pool which are co-locating with extant instances. However, such a policy will drive up costs, although an opportunity does exist to minimise the degree of co-location in the pool whenever the broker scales the pool down.
Instead, suppose the broker receives a request for instances from a client for a particular workload at some tranche level. Ideally, the broker should allocate instances from the pool in such a way as to minimise the impact on the performance of instances already sub-let; note that we cannot improve extant performance, only minimise degradation. At a minimum, the broker should try to ensure that allocating instances to one client does not result in an SLA violation for another.

In order to achieve this the broker needs to be able to determine: (1) which instances may be correlated; and (2) the degree of correlation between different workloads. For example, if the broker knows instances A and B are co-locating and that workloads w1 and w2 have minimal correlation, then these instances can be ‘safely’ allocated to these workloads. We discuss how to identify co-locating instances next; first, the sketch below illustrates how such knowledge might drive allocation.
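The following is a minimal sketch, in Python, of a workload-aware allocation policy of this kind. The instance names, the co-location map and the pairwise interference scores are hypothetical placeholders (the scores are loosely based on the degradation figures reported above); in practice both the map and the scores would be estimated by the broker.

    # Minimal sketch of workload-aware allocation from the broker's pool.
    # The co-location map and interference scores are hypothetical placeholders.

    colocated = {frozenset(("i-a", "i-b")), frozenset(("i-c", "i-d"))}
    interference = {                                  # fractional slowdown estimates
        frozenset(("pbzip2",)): 0.20,                 # pbzip2 against itself
        frozenset(("pbzip2", "sa-learn")): 0.07,
        frozenset(("pbzip2", "stream")): 4.50,
    }

    def pair_cost(inst1, wl1, inst2, wl2):
        """Estimated degradation if inst1 runs wl1 while inst2 runs wl2."""
        if frozenset((inst1, inst2)) not in colocated:
            return 0.0                                # separated instances do not interfere
        return interference.get(frozenset((wl1, wl2)), 0.0)

    def allocate(free_instances, sublet, workload, n):
        """Pick n free instances for a new request, minimising the added
        interference with instances already sub-let (sublet: instance -> workload)."""
        def added_cost(candidate):
            return sum(pair_cost(candidate, workload, inst, wl)
                       for inst, wl in sublet.items())
        return sorted(free_instances, key=added_cost)[:n]

    # Example: allocate 1 instance for sa-learn, preferring an instance that is
    # separated from the one already running pbzip2.
    print(allocate(["i-b", "i-e"], {"i-a": "pbzip2"}, "sa-learn", 1))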


The problem of detecting co-locating instances has mainly been studied in the context of Cloud security. Zhang et al. (2012) show that the sharing of an L2 cache between VMs is a vulnerability, demonstrating that one VM may extract cryptographic keys from another VM on the same host. Such an attack is known as an access-driven side channel attack. Particularly noteworthy is the fact that the attack was demonstrated on an SMP system; in this case the challenge of core migrations, i.e. the scheduling of a VM onto different cores during its lifetime, as would be encountered in a Cloud environment, had to be overcome.
The vulnerability of a shared cache relies, in part, on exploiting hypervisor scheduling. Methods to increase the difficulty of successfully using such attacks are under development (Lui et al., 2014) and are already being integrated into Xen. Whilst such work mitigates fine-grained attacks, other attacks that seek to obtain a large share of the L2 cache are considered viable. In this case then, the intention is to be a noisy neighbour.
The potential to extract information between co-locating virtual machines has led to work on targeted attacks in the Cloud, where an attacker seeks to co-locate with a specific target. The attacker identifies the target via a publicly available service (such as HTTP) that it is running, and from this the IP address of the service can be determined. Such a targeted attack requires techniques for determining co-location with the target before the attack can be launched successfully. We classify the techniques developed to date as: (1) Simple Network-Based Probes; (2) Network Flow Watermarking; and (3) Cache Avoidance.
Ristenpart et al. (2009) conduct a number of network-based probes, including (1) ping round-trip time and (2) common IP address of dom0. The latter technique works as follows: suppose we have two instances, A and B. Instance B runs a service that any machine on the internet may connect to (for example an HTTP service). Instance A launches a traceroute using a TCP probe against the open port on instance B. Note that, as ICMP echo replies (pings) may be disabled, the technique requires a running service on B. The traceroute command outputs the list of IP addresses that have responded to the TCP probe.
The last IP address to respond is that of the service itself. On Xen, the penultimate response comes from the privileged management domain, dom0. In this way, instance A may determine the IP address of the dom0 which is running and managing instance B. Similarly, instance A can launch such a probe against itself and determine the IP address of the dom0 it is running on. If the two IP addresses are the same then the instances are running on the same host.
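For concreteness, the sketch below shows how such a probe might be scripted around the Linux traceroute utility (using its -T option for TCP probes, which typically requires root). It is purely illustrative: as noted below, dom0 no longer responds to traceroute on EC2, so the technique is of historical interest only.

    # Illustrative sketch of the dom0 probe (Ristenpart et al., 2009): run a TCP
    # traceroute against an open port and take the penultimate responding hop,
    # which on a classic Xen setup is dom0.
    import subprocess

    def responding_hops(target, port=80):
        """Return the IP addresses that answered a TCP traceroute to target:port."""
        out = subprocess.run(
            ["traceroute", "-T", "-n", "-p", str(port), target],
            capture_output=True, text=True, check=True).stdout
        hops = []
        for line in out.splitlines()[1:]:        # skip the 'traceroute to ...' header
            fields = line.split()
            if len(fields) > 1 and fields[1] != "*":
                hops.append(fields[1])
        return hops

    def dom0_address(target, port=80):
        """Penultimate responding hop, i.e. dom0 on a standard Xen deployment."""
        hops = responding_hops(target, port)
        return hops[-2] if len(hops) >= 2 else None

    # Instances A and B are (most likely) co-locating if the dom0 addresses match:
    # dom0_address(ip_of_A) == dom0_address(ip_of_B)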
To test the veracity of these methods, Ristenpart et al. also use access timings of shared drives. No details are provided of the type of drive being used (local or network) or how the disk is being shared. Whilst access times to shared drives may potentially be used for detecting co-locating siblings, there are a number of issues, not discussed, that demand further investigation.
Perhaps most important is the widely reported variation in disk read/write timings on EC2, which clearly needs to be accounted for in any test that uses such timings as a method for detection. Perhaps unsurprisingly, dom0 no longer responds in a traceroute, as Bates et al. (2014) have confirmed, and so the method is no longer viable.
Whilst simple network probes no longer work, Bates et al. (2014) observe that instances on the same host most likely share the network card, and this has led to the development of a technique which allows one instance to inject a watermark into the network flow of another instance on the same host. This time we require three instances: A, B and C. As before, instance B is the target, and must be running a publicly accessible network service. Instance C is the control instance, whilst instance A is being tested for co-location with B.
As B is running an accessible service, instance C connects to this service and maintains an open connection with it. In doing so instance C establishes a network flow from instance B to itself. Similarly, instance C establishes a network flow from itself to instance A. If A and B are on the same host then they are (most likely) sharing a network interface; that is, both their network flows to instance C are through the same physical network card.
The aim then is to exploit this shared physical network card, with instance A attempting to inject a watermark into the flow from B to C. If instance A is able to do this, then A and B are (most likely) on the same host. This technique was demonstrated on a variety of stand-alone virtual systems. It is notable as the only hypervisor-agnostic test we have found, as all others seek to exploit properties of hypervisors, as indeed we ourselves do. However, as the authors state, there are a number of defences against watermarking in place in Public Clouds, and in particular on EC2, and to date the authors have been unable to successfully implement the test on a Public Cloud.
Zhang et al. (2011) develop a technique for detecting the presence of any other instances on the host, with the aim of detecting sole-tenancy violations. With sole-tenancy instances a provider guarantees that the instance will not co-locate with instances owned by another user; however, multiple sibling sole-tenancy instances may co-locate.
To detect violations, a cache avoidance strategy is used whereby an instance avoids use of the L2 cache and then measures cache timings. By detecting variation in these timings, an instance can determine whether it is sharing the cache with other instances on the same host. Avoiding use of the cache requires modifications to the guest kernel, which is technically challenging, and the approach also carries a performance overhead.
In a sole-tenancy environment, and assuming no violations of the tenancy agreement, multiple sibling instances can co-ordinate their use and avoidance of the cache and in doing so detect each other. However, it is not clear whether this technique can be extended to a multi-tenancy environment, where we expect the presence of other instances; as such the technique is unproven in this case.
In summary, neither simple network probes nor network flow watermarking co-location tests work on EC2 due to measures already in place, whilst cache avoidance in a multi-tenancy environment remains unproven and technically challenging. From the perspective of the performance broker, what is needed is a simple, quick test that can be used to determine, with a high degree of accuracy, whether two instances are co-locating or separated.
As we have seen in section 6.6, co-location detection is a challenging problem, and one which is likely to be amplified in marketplaces such as the CeX as we will (almost certainly) have a multi-hypervisor environment. That is, different providers may use different hypervisors. As such, a variety of different techniques may be required in order to determine co-location amongst a set of instances obtained from the CeX.
The problem of co-location for the broker is somewhat different from that of a targeted attack. In the latter case, positive identification is required before an attack is launched. For the broker, however, it may be simpler to develop a test which can positively identify non co-location; failure to prove non co-location does not necessarily imply that instances are co-locating. We refer to non co-location as separation and say that non co-locating instances are separated. Proving separation for instances is useful for the broker as there will be no host-based performance correlation between them.
Our first observation is that instances with different CPU models are separated. This is not totally reliable, as a hypervisor can obfuscate its CPU model; however, unless the CeX has a hardware requirement for particular CPU models, or a provider is unwilling to expose its hardware in any way, it is not clear why a provider would wish to obfuscate the CPU model. Instances running on different hypervisors are obviously separated. It is often possible for an instance to detect which type of hypervisor it is running on; indeed, this is usually straightforward unless the hypervisor is purposely configured so as to prevent it. KVM, for example, typically makes use of virtio (paravirtualised I/O) drivers in order to improve guest performance, and a list of kernel modules in the guest will reveal the presence of virtio modules (drivers). Preventing the detection of KVM requires (at a minimum) full emulation of network cards and disk sub-systems.
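These checks are straightforward to script from within a guest. The sketch below shows one possible form, reading standard Linux interfaces (/proc/cpuinfo, /proc/modules and /sys/hypervisor); it is a heuristic sketch rather than a definitive detector.

    # Heuristic sketch of the separation checks described above, reading standard
    # Linux interfaces from within a guest. Different CPU models, or different
    # hypervisors, imply separation (subject to the obfuscation caveat above).
    from pathlib import Path

    def cpu_model():
        """Return the CPU model string from /proc/cpuinfo."""
        for line in Path("/proc/cpuinfo").read_text().splitlines():
            if line.startswith("model name"):
                return line.split(":", 1)[1].strip()
        return None

    def looks_like_kvm():
        """Presence of virtio modules in the guest suggests KVM."""
        return "virtio" in Path("/proc/modules").read_text()

    def looks_like_xen():
        """A /proc/xen mount point or /sys/hypervisor/type of 'xen' suggests Xen."""
        hv_type = Path("/sys/hypervisor/type")
        return Path("/proc/xen").exists() or (
            hv_type.exists() and hv_type.read_text().strip() == "xen")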
Under Xen, Xenstore (Xen, 2015) is a data area exported from the hypervisor to all instances; its interface is a pseudo file system which can be mounted at /proc/xen within a guest. This is analogous to the /proc and /sys pseudo file systems in Linux, which provide an interface for user-space processes to the Linux kernel. On a standard Xen system any instance can access Xenstore with the following steps: (1) install the xen-utils package; and (2) mount the /proc/xen filesystem: mount -t xenfs none /proc/xen. If this succeeds then we have identified a Xen hypervisor. To appreciate the extent of use of Xen, we have verified that instances from EC2, Rackspace and GoGrid can all access Xenstore. On a standard Xen system, an instance can extract information such as the CPU weightings assigned to instances on the host. As one would expect, on EC2 the data exported to instances via Xenstore is restricted, and does not allow a domain to obtain any information other than about itself. However, Xenstore is particularly useful for our purposes as we can use information within it to detect separation of instances running atop Xen, as we discuss next.
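As a concrete example, and assuming a guest with the xen-utils package installed (which provides the xenstore-read tool) and root access, the two steps above might be scripted as follows. On a standard Xen system, xenstore-read resolves relative keys against the guest's own Xenstore path, so reading the key domid returns the instance's own domain identifier, which the test described next relies on.

    # Sketch of Xenstore access from a guest, assuming xen-utils is installed
    # and the commands are run as root.
    import subprocess

    def mount_xenstore():
        """Step 2: mount the Xenstore pseudo filesystem at /proc/xen."""
        subprocess.run(["mount", "-t", "xenfs", "none", "/proc/xen"], check=True)

    def read_domid():
        """Read this instance's domid via xenstore-read."""
        out = subprocess.run(["xenstore-read", "domid"],
                             capture_output=True, text=True, check=True)
        return int(out.stdout.strip())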
In this section we describe a simple test for detecting separated sibling instances, i.e. instances owned by the same user, appropriate for the Xen hypervisor. Xen is widely deployed in Infrastructure Cloud systems and is in use at EC2, Rackspace and GoGrid, amongst others; given its prevalence, we would expect a number of providers on the CeX to use Xen. The test is based on how Xen enumerates guests.
In a standard deployment, each host runs a Xen hypervisor which is responsible for managing the life-cycle of virtual machines on the host. Xen terminology refers to VMs as domains; however, we use the term instances as this is commonly used in the Cloud setting. After booting, the Xen hypervisor starts a privileged instance called domain 0, or dom0. Dom0 provides both an API and a console from which an administrator can create new unprivileged domains, referred to as domUs. Upon creation, each instance is assigned a domain identifier, referred to as the domid, which uniquely identifies instances on the host. Note that, as Xen runs independently on each host, it is entirely possible that instances on different hosts have the same domid; for this reason, Xen also assigns each instance a UUID which uniquely identifies it across a deployment of multiple hosts.
The domid is a 16-bit integer. Allocation starts at 1 and is monotonically increasing, with Xen assigning the next available domid, 2, 3 and so on, whenever a new domain is started. Xen domids have a rather interesting property: an instance can obtain a new domid simply by rebooting, upon which it is assigned the next available domid. Indeed, in the case where there is only one instance on a host, the instance can cause its domid to increase monotonically simply by repeated reboots. That is, if the current domid is Y, then the only activity on the host which can cause the next available domid to increment is the activity of the instance itself: a reboot produces a new domid of Y + 1, a second reboot increments the domid to Y + 2, and after k > 0 reboots the instance’s domid is Y + k.
In multi-tenant (on-demand) environments, we can increment the next available domid in this fashion by at least a fixed amount, but not by an exact amount. That is, an instance which reboots itself k > 0 times causes the next available domid to increment by at least k. There may well be instances other than the siblings on the host, which of course also share the next available domid and whose reboots cause it to increment, as will any new instances launched on the same host.
A domid wraparound occurs when the last domid assigned is 65535, in which case the next available domid is 1, unless there is an extant domain with that domid, in which case it is 2, and so on. For simplicity we ignore wraparounds, but note that the arguments we make based on domids can be extended to include this case.
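As a concrete illustration of the enumeration behaviour just described, the following minimal model captures a monotonically increasing 16-bit counter that skips domids still held by running domains on wraparound. It is a sketch for exposition only, not code taken from Xen.

    # Minimal model of the domid enumeration described above: a monotonically
    # increasing 16-bit counter, skipping domids still held by running domains
    # on wraparound. Purely illustrative; not taken from the Xen source.
    class DomidAllocator:
        MAX_DOMID = 65535          # domids are 16-bit

        def __init__(self):
            self.next_domid = 1
            self.active = set()    # domids of currently running domains

        def start_domain(self):
            """Assign the next available domid to a newly started domain."""
            while self.next_domid in self.active:
                self.next_domid = self.next_domid % self.MAX_DOMID + 1
            domid = self.next_domid
            self.active.add(domid)
            self.next_domid = domid % self.MAX_DOMID + 1
            return domid

        def reboot_domain(self, domid):
            """A reboot releases the old domid and assigns the next available one."""
            self.active.discard(domid)
            return self.start_domain()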
We have two initial questions regarding domids that we address: (1) does enumeration on EC2 proceed in the same manner as on a standard Xen system?; and (2) can we estimate the rate of domid increase on a host? To investigate the first question we launch a single-tenancy m4.large on Amazon’s EC2 and collect its domid, which is 51. Note that as it is a single-tenancy instance we are guaranteed an absence of neighbours. We reboot this instance 5 times, each time collecting the new domid, obtaining the sequence: 52, 53, 54, 55 and 56. This is the same sequence that a standard Xen system would produce, indicating that domid generation on EC2 is the same as found on standard Xen. We repeat this for multiple single-tenancy instances and each time we obtain the expected domid sequence.
We next launch 4 pairs of m4.large instances in the US-East-1 Region, with each pair being launched in a different availability zone (AZ): us-east-1a, us-east-1b, us-east-1c and us-east-1e. This time we use dedicated-host instances in each AZ. Dedicated-host instances are similar to single-tenancy but allow for placement of instances onto the same host, so we know that each pair is co-locating. For each pair we record the domids, which are shown in the table below:

Table: Domid pairs per AZ.

AZ            Domid pair
us-east-1a    35, 36
us-east-1b    59, 60
us-east-1c    21, 22
us-east-1e    47, 48

We can observe that the domids in each pair are consecutive, and on a standard Xen system instances started consecutively on the same host would indeed have consecutive domids. For each pair, we then repeatedly reboot the instance with the currently lower domid; each time, its domid increases by exactly 2, making it the higher of the pair. This behaviour is exactly as we would expect on a standard Xen system, and we conclude that Xen guest enumeration on EC2 proceeds in the same manner as on a standard Xen system.


In an on-demand system, however, the next available domid is incremented whenever a new instance is started on the host or an extant one is rebooted. An instance cannot directly observe this, and can only detect it through a reboot. Each host will therefore have a rate of domid increase due to instance activity. We investigate the rate of domid increase by launching 100 m3.medium instances for a period of 3 hours and determining the domid of each instance at the start and at the end of the period. From this we can calculate the rate of domid increase on the hosts the instances are running on. Below we present summary statistics as well as a histogram of our results.


Figure: Histogram of the hourly domid increase of 100 m3.medium instances over a 3-hour period in us-east-1 on EC2.

Table: Summary statistics for the hourly domid increase.

Mean    Sdev    Median    Min     Max
1.56    1.64    0.67      0.34    6.67

Next, we leave a t1.small instance running for 180 days. Its initial domid is 2364, and after 180 days we reboot it to obtain a new domid of 5584; the rate of domid increase on the host the instance is running on is therefore approximately 0.74 per hour. That the observed rates are low is perhaps to be expected; we certainly would not expect instances to be regularly rebooted, indeed quite the opposite. Arguably the majority of the domid-increasing activity is instance churn, i.e. instances being terminated and so freeing resources on the host for new instances. A large rate of domid increase is likely an indication of pathological behaviour on the host, reflecting such behaviour from its users.


As sibling co-locating instances can both read and change the next available domid on their host, we propose to use this as a method for detecting co-location. That is, if instances are co-locating we should be able to coordinate their actions so as to have a predetermined effect on the next available domid. Sibling instances on the same host, in the guaranteed absence of non-siblings, can coordinate their reboots and in doing so increment the next available domid by at least some agreed fixed amount each time. That is, instance A can increment the next available domid by k > 0 and this is observable by a sibling instance B, as A and B have the next available domid in common. In turn, B can also increment it by at least a fixed amount, and again this is observable by A.
Suppose now that we have two on-demand instances A and B such that domid(A) < domid(B). If, upon a reboot of A, we still have domid(A) < domid(B), then A and B are separated. However, suppose we have domid(A) > domid(B): are A and B co-locating? We cannot say this for certain, since the natural rate of domid increase on instance A’s host may have caused it to increase beyond domid(B). We can, however, do the following: we alternate k > 0 reboots of A with k > 0 reboots of B. Each time we do this, the difference in their respective domids must be at least k; if not, they are separated. Being able to introduce this minimum distance between domids repeatedly is strong, although not conclusive, evidence of co-location, particularly if the rate of domid increase caused by the actions of A and B is significantly above any naturally observed rate. The sketch below summarises this decision logic.
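The following is a minimal sketch of the test; get_domid and reboot are hypothetical helpers standing in for reading the domid via Xenstore and rebooting the instance via the provider’s API.

    # Sketch of the separation/co-location test described above. get_domid(inst)
    # and reboot(inst) are hypothetical helpers: in practice they would read the
    # domid via Xenstore and reboot the instance via the provider API, waiting
    # for it to come back up.

    def separated_after_single_reboot(domid_a_before, domid_a_after, domid_b):
        """If A started below B and remains below B after a reboot of A,
        then A and B must be on different hosts."""
        return domid_a_before < domid_b and domid_a_after < domid_b

    def alternating_reboot_test(inst_a, inst_b, get_domid, reboot, k=3, rounds=2):
        """Alternate k reboots of A with k reboots of B. If any block of k reboots
        fails to push the rebooted instance's domid at least k above the other's,
        the pair is separated; repeatedly achieving the gap is strong, though not
        conclusive, evidence of co-location."""
        for _ in range(rounds):
            for first, other in ((inst_a, inst_b), (inst_b, inst_a)):
                other_domid = get_domid(other)
                for _ in range(k):
                    reboot(first)
                if get_domid(first) - other_domid < k:
                    return "separated"
        return "likely co-located"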

To test this, we use 2 pairs of instances, the first of which obtained domids (7635, 7638) and the second (9536, 9538). As the first pair of instances were on E5-2650 hosts and have close domids, they are good candidates for co-location. However, upon rebooting the instance with domid 7635 its new domid was 7636, and so it is separated from the instance with domid 7638. For the second pair, again both with a CPU model of E5-2650, rebooting the instance with domid 9536 gave a new domid of 9539, which is greater than 9538. We rebooted this instance a further 5 times, and after the last reboot its domid was 9544. We then rebooted the instance with domid 9538, after which its domid was 9545. This more strongly suggests co-location, particularly as we are able to reboot an instance in 60 seconds and so can increase the domid at a rate of 60 per hour, far in excess of anything we observed. We also note that a user is free to set the domid distance to any value they like by rebooting (we set it to 6), and to repeat this as many times as they wish.


The usefulness of identifying co-location stems from the fact that different workloads run on co-locating instances have different degrees of correlation, as discussed in section 5.9. Knowing when instances are co-located would allow the broker to allocate instances from the pool in such a way as to minimise mutual degradation.









