
Bachelor Informatica

Universiteit van Amsterdam

Resource usage in hypervisors.

Casper van der Poll

August 20, 2015


Abstract

This thesis presents the results of an elaborate study of the power consumption of two open source hypervisors. At the data center of Schuberg Philis, both KVM and Xen were tested using stress tests on the main components of the servers in use. The tests were run on generation 8 machines, stressing the CPU, IO, memory and HDD components. The power consumption results were comparable, so a decision was made to also track performance using sar log files. The results of these tests point towards KVM as a better solution than Xen. The results are a first step towards a scale the Greening the Cloud organisation is trying to create, with which software can be rated so users will know which software is ideal for a given problem.


Contents

1 Introduction
  1.1 Introduction
  1.2 Research questions
2 Theoretical framework
  2.1 Virtualization
    2.1.1 What is virtualization
    2.1.2 Full versus paravirtualization
  2.2 What is a hypervisor
  2.3 Xen
  2.4 KVM
3 Related work
  3.1 Concerning hypervisors
  3.2 Concerning green software
4 Experiments
  4.1 Measuring Setups
    4.1.1 Internal measurements
    4.1.2 External measurements
  4.2 Setup
  4.3 Results
  4.4 Performance
    4.4.1 Comparing performance
    4.4.2 Experimental conclusions
5 Conclusion
  5.1 Conclusion
6 Future research
  6.1 Future research
7 Appendix
  7.1 Scripts


CHAPTER 1

Introduction

1.1 Introduction

Cloud computing, the on-demand availability of hardware, software and data, is an increasingly popular phenomenon that keeps growing in size and usage. Your pictures on Flickr, your emails at home or friends on Facebook are all data in the cloud. A cloud consists of data centers all over the world. Data centers consume a lot of energy, and according to Schomaker et al.[25] they can be categorized into three different sizes:

size      number of servers   usage
small     100 - 500           50 kWh
medium    500 - 5000          250 kWh
large     more than 5000      2.5 MWh

Figure 1.1: The energy demand of different components in data centers [18] (percentage of energy consumed by individual data center components).

These data centers consist of many different energy consuming components; figure 1.1 shows some data on the power consumption of these components. To measure how efficiently a data center runs, the PUE is calculated. The PUE, short for power usage effectiveness, is calculated by dividing the total amount of energy the facility consumes by the energy consumed by the IT equipment [11]. The outcome of this calculation, however, does not really say anything about the power consumption of the individual components, e.g. the server, let alone about the individual parts of the servers.
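
As a worked example with made-up numbers (not measurements from this thesis): a facility drawing 500 kW in total while its IT equipment consumes 400 kW has a PUE of 500/400 = 1.25, the remaining 100 kW being cooling, power distribution losses and other overhead.

# hypothetical PUE calculation, illustrative numbers only
total_kw=500   # total facility power draw
it_kw=400      # power drawn by the IT equipment alone
echo "scale=2; $total_kw / $it_kw" | bc   # prints 1.25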

The majority of the providers of cloud computing make use of a variety of hypervisors to control their environments. The environments we speak of are mostly virtual machine setups that host websites, email servers or databases.

The hypervisors are thoroughly studied; however, most of these studies are centered on research into usage or performance. This is in line with the general consensus in computer science: "more, or stronger, hardware is always better". The data centers containing the server stacks do monitor their power consumption and try to improve their setups to create a more effective or green environment. By green we mean the general interest in resource and power usage compared to performance in these soft- or hardware solutions. This behaviour is promoted by Google[4], which not only optimises its own environment, but shows interest in the power consumption of competing data centers as well.


The project described in this thesis is part of a larger project, 'Greening The Cloud'[6]. This group consists of multiple commercial companies, large and small, and some educational institutions. The HvA, VU and UvA are providing students as well as positions to research software and hardware using advanced measuring systems. Because of the extensive usage of cloud services the amount of CO2 produced by the IT sector exceeds the amount produced by the aviation sector[7]; "Greening the cloud" tries to research the impact of software on hardware. A big group of commercially orientated businesses is cooperating with "Greening the cloud" in order to create awareness of the climate effects caused by cloud services. A small selection of participating companies are Greenhost, Schuberg Philis and Leaseweb, and even software and hardware providers such as VMware and GreenIt. These organisations are a small example of the companies connected to this project, which is aimed at obtaining more insight into the exact impact of the power consumption of software on hardware in cloud environments. In this thesis an overview of an elaborate study of the power consumption of two open source hypervisors, KVM and Xen, is presented.

Firstly, earlier published scientific research will be reviewed to see whether it offers insights that can be used for this thesis. Secondly, actual data from hosting companies will be assessed and analyzed in order to understand the workloads used on their machines. Experiments will then be run using two different hypervisors and specific loads whilst monitoring their behaviour and power consumption. This research will be done in cooperation with Schuberg Philis. After the results are presented, a contribution towards a greenness scale will be made. By creating this scale we hope to inform companies on how well their setups do and perhaps steer them towards better solutions in the foreseeable future.


1.2 Research questions

The research described in this thesis is but a small part of a solution towards a greener environment in cloud computing. In the very near future the companies and researchers forming the Greening the cloud project would like to be able to research the impact that software has on the power consumption of hardware. To be able to do this, a lot of research needs to be done on software in everyday use; by doing so, ultimately a large scale can be created on which software is rated by the impact it has on the environment. This scale is the ultimate goal of all research done by the Greening the cloud group. In our own research we will try to make a start on this scale by researching the power consumption and performance of two open source hypervisors.

For this scale to take shape, hypervisors running in their normal environment have to be studied while measurements of both their performance and their power consumption are made. Measuring performance has been done for years, so adequate setups and software for this are available. The power aspect is a relatively unexplored side of software testing which needs some investigation: what changes to normal test setups have to be made to be able to measure power consumption?

When it is clear what needs to be done to acquire our results, a decision needs to be made on where our tests will take place; during a meeting of the Greening the cloud project multiple companies volunteered to help us with this research. The moment this decision is made, we will have to assess what is possible for us to do at the chosen company. The next step will be to start testing and try to find which of the hypervisors performs best under the chosen set of loads. The findings will provide insight into the behaviour of these hypervisors and their power consumption based on a load or type of test. The final step is to create an overview on which the findings can be presented; this final part is quite important because it will carry the message and needs to be informative and clear. An important facet of this overview is the ability to be expanded or supplemented, since "Greening the cloud" has only just started by testing these two hypervisors and its final goal is to research a lot more software. With this in mind the last question will be: how can we present our results in a way that scales and adapts well?


CHAPTER 2

Theoretical framework

2.1 Virtualization

In the late 1990s VMware figured out how to virtualize the x86 platform, which was once assumed to be an impossible feat. VMware's solution, a combination of binary translation and direct execution, allowed multiple guest operating systems to run in full isolation on the same machine. The continued rapid development is mostly due to the tremendous savings generated by the tens of thousands of companies using this technology. Figure 2.1 shows the rapid development of virtualization and specifically Xen.

Figure 2.1: Timeline of virtualization. © Understanding Full Virtualization, Paravirtualization, and Hardware Assists [14]


2.1.1 What is virtualization

Figure 2.2: Schematic overview of virtualization. © Understanding Full Virtualization, Paravirtualization, and Hardware Assists [14]

Virtualization is a technique which allows us to separate a service request from the underlying physical delivery of that service. Figure 2.2 contains an overview of virtualization using the VMware software. At the bottom of the figure we see the hardware of the machine, which will be used by all the virtual environments. The second layer shows the x86 architecture and on top of that is the VMware software. This software then supports multiple, in this case two, virtual machines or guests, which have their own operating systems and run their own applications. The guests or virtual machines are isolated and will not interfere with each other.

Successful partitioning of a server to support the concurrent execution of multiple virtual environments running their own operating systems poses several challenges. Firstly, it is very important to make sure the different virtual environments are isolated from one another; this is particularly important for safety and for control of the underlying hardware and its usage. Secondly, a variety of different operating systems needs to be supported to accommodate the heterogeneity of popular applications. Thirdly, the performance overhead introduced by the virtualization should be minimal.

With x86 computer virtualization a layer is added between the hardware and the operating system of the server. These virtualization techniques, together with the increase in performance potential of desktops and servers, have led to a simplification of software development and testing as well as enabling server consolidation and enhancement of data center agility and business continuity. For example, servers are now able to run 24/7 with almost no downtime needed for backups or hardware maintenance. VMware states it has customers using production servers with no downtime for over 3 years[14].

2.1.2 Full versus paravirtualization

By combining binary translation and direct execution VMware is able to virtualize any x86 operating system. This method achieves full virtualization: the virtual machine is fully abstracted from its underlying hardware. The virtual machine is not aware that it is virtualized and requires no modification; this method is the only option which requires no hardware assist or operating system assist to virtualize sensitive and privileged instructions. Full virtualization offers the best isolation and security for its guests, in this case virtual machines, and it simplifies migration and portability.

Where a fully virtualized guest operating system has no knowledge of being virtualized, a paravirtualized operating system does have this awareness. However, paravirtualization poses some difficulties; for instance, an operating system hosted this way needs to be modified, so its compatibility and portability are poor. This method can also introduce significant support and maintainability issues, because it requires deep operating system kernel modifications. On the positive side, its virtualization overhead is minimal, although this varies with different workloads.


2.2 What is a hypervisor

A hypervisor or Virtual Machine Manager is a virtualization technique which allows machines to run multiple instances of operating systems simultaneously and independently of each other. The hypervisor manages these independent installations. At this moment we can classify a hypervisor as type I or type II; these types are defined by their setup.

A type I hypervisor is supplied as a bare metal system; this means that the hypervisor is installed directly on the hardware, with no native operating system between it and the hardware. The main benefit of this system is the direct communication between the hypervisor and the hardware. The hardware resources will then be paravirtualized and distributed among the virtual machines. A type II hypervisor, also called a hosted hypervisor, is loaded on top of an operating system. With the current level of software enhancements this system is still able to perform optimally, although an extra hop is needed between the virtual machine and the underlying hardware.

On the virtualization software virtual machines will be loaded. These virtual machines can consist of virtual appliances, operating systems or other virtualization-ready workloads. A virtual machine will not be aware of the fact that it is virtual; it will think it is its own unit with its own resources. With this technique one server will be able to host a variety of virtual machines running applications, instead of the hardware being used by a single party.

The virtualization software is run on host machines. These machines are the physical hosts whose resources are divided between the running virtual machines by an administrator. The resources include CPU time and memory. This allocation of resources can be done dynamically and on demand, so a virtual machine running a low priority task would receive less RAM than, for instance, a virtual machine running a domain controller.
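
As an illustration of such dynamic allocation, a libvirt-managed KVM guest can be resized at runtime; the sketch below assumes a running domain named lowprio-vm (a hypothetical name) whose maximum memory and vCPU count allow the change.

# shrink a low-priority guest at runtime (libvirt/virsh; the domain name is hypothetical)
virsh setvcpus lowprio-vm 1 --live          # reduce the guest to one virtual CPU
virsh setmem lowprio-vm 1048576 --live      # cap its memory at 1 GiB (value in KiB)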

The virtualization software then installs paravirtualization tools on the virtual machines. These normally provide a set of utilities and drivers to make sure the virtual machine runs in optimal conditions[8].

This thesis will focus on the performance of two of these hypervisors; we will continue by giving a small introduction to Xen and KVM, which will be tested later on.

2.3 Xen

Xen, a type I hypervisor, was introduced in the early 2000s; at this point in time developers started to realize modern computers were sufficiently powerful to use virtualization to present the illusion of many smaller virtual machines, each running its own separate operating system[16]. Xen wants to enable users to dynamically instantiate an operating system to execute whatever they want. There are multiple ways to do this; the easiest is to simply deploy one or more hosts and allow users to install files and start working. However, this needs to be controlled and supervised, and experience shows that this could quickly become a time-consuming exercise. More importantly though, this approach does not adequately support performance isolation, meaning that the scheduling techniques, memory demand, network traffic and disk access of one process would have a significant impact on other processes running at the same time on the same host[16].

Throughout the development of Xen the developers wanted to separate policy from mechanism wherever possible, meaning that the parts of the machine in control of authorization should not dictate the policies about how decisions are made. Although the hypervisor needs to be involved in data-path aspects, there is no need for it to be involved in, or even aware of, higher level issues, for example how the CPU is shared or which packets a domain may transmit. The hypervisor itself provides only basic control operations. This resulted in near equal performance between XenoLinux over Xen and a baseline Linux system.

2.4 KVM

KVM, or Kernel-based Virtual Machine, a type II hypervisor, is a virtualization infrastructure introduced in February 2007; it requires the processor to have a hardware virtualization extension to be able to run. KVM itself is not really the software responsible for the virtualization, though: it simply enables operating system images to be virtualized by exposing the /dev/kvm interface. When VMware started the virtualization business its overhead was immense, consuming almost half of the CPU. Xen later helped to eliminate this issue, and nowadays these pains are long gone. Xen brought the overhead back to around 10 percent using paravirtualization; however, due to the requirement of a modified kernel it was never fully accepted.

This is where KVM emerged: the main CPU chipmakers, Intel and AMD, released extensions to allow the processor to virtualize guests without binary translation[3]. This makes KVM differ from Xen in its relative simplicity; whereas Xen assumes control of the machine, KVM is part of Linux and uses the regular Linux scheduler and memory management, which means that KVM is much smaller and simpler to use. KVM is a type II hypervisor, meaning that it is run as a piece of software on a Linux distribution from which it is able to virtualize a wide variety of guest operating systems[9].
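
As a quick sanity check on a Linux host, the hardware virtualization extensions and the /dev/kvm interface mentioned above can be inspected with standard tools (a sketch, not part of the thesis setup):

# count CPU flags for Intel VT-x (vmx) or AMD-V (svm); 0 means no hardware virtualization support
grep -Ec '(vmx|svm)' /proc/cpuinfo

# if the kvm kernel modules are loaded, the interface KVM exposes should be present
lsmod | grep kvm
ls -l /dev/kvm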


CHAPTER 3

Related work

3.1 Concerning hypervisors

The recent growth in cloud computing and storage systems has driven the development of virtualization through hypervisors. However, there are as many different hypervisors as there are tasks, and for every task there is a different optimization strategy.

In 'A quantitative comparison of Xen and KVM' by Deshane et al.[17] results are shown of a study focused on overall performance, performance isolation and scalability. For overall performance the researchers ran a CPU-intensive test, a kernel compile, an IOzone write test and an IOzone read test. In the CPU-intensive and the kernel compile tests Xen scores better than KVM; however, KVM had higher read and write performance, probably due to disk caching. Secondly, they tested performance isolation, which has a high priority especially when multiple virtual machines are run. Xen shows the most promising results in the stress test; however, there is very little isolation for network sending, and no isolation for receiving. KVM shows good isolation for the stress test and, unexpectedly, for the network send test, though considerably less isolation in the disk and network receive tests. The last experiment conducted concerns scalability. This research shows that Xen scales excellently and is able to share its resources well, whereas KVM is not able to maintain performance as the number of guests increases.

Li et al.[21] studied the performance overhead among three hypervisors using Hadoop benchmarks. They experimented with Xen, KVM and a commercial hypervisor, and found that in CPU-bound benchmarks the difference in performance impact was negligible. However, there is a significant difference in write output and completion times between them; according to Li et al. this is due to the efficiency of request merging when processing a large number of write requests to disk. In their study KVM proves able to merge a greater number of requests than the other hypervisors. Li et al. end by stating that "experiment results show the need to better study the differences among hypervisors in order to better determine which hypervisor to use for user application"[21].

Kumar et al.[24] studied the behaviour and performance of three virtualization strategies by setting up a cloud using the CloudStack software. The experiments were performed using Xen, KVM and VMware ESXi. CloudStack is an open source cloud computing software for creating, managing, and deploying infrastructure cloud services; it uses existing hypervisors such as KVM, VMware vSphere, and XenServer/XCP for virtualization[2]. Their goal was to test the mentioned hypervisors in cloud environments. The results showed that Xen and VMware ESXi performed equally in almost all tests, with slight advantages of Xen over VMware in the memory and IO tests, and advantages of VMware in the CPU test. KVM, though, needs to improve on almost all fronts[24].

In 'Evaluation of Different Hypervisors Performance in the Private Cloud with SIGAR Framework' by Reddy et al.[23], multiple hypervisors are pitted against each other in a cloud environment and their performance is gathered with the SIGAR API. The hypervisors used are: Xen,


shown in other research, scalability is an important aspect of virtualization.

These studies state that every hypervisor shows different potential in different environments, carrying different loads. However, they have not taken power consumption into account; naturally, a machine using a lot of energy could potentially outperform a system which uses less.

3.2 Concerning green software

The heightened interest in cloud computing is rapidly changing the infrastructure of information and communication technologies, which means an acceleration in the deployment of data centers. These data centers were estimated to consume around 61 billion kWh of electricity in 2006, double the amount of energy used in 2000. This consumption was expected to double again by 2012, resulting in a staggering 4 kWh per person[19].

The use of virtual machines has many benefits as stated in previous sections, though there are some drawbacks. The lack of VM power metering capability takes away the advantages of server power metering that was available in non-virtualized environments[20].

Kansal et al.[20] continue by stating that a completely new mechanism needs to be developed to be able to measure these systems. Their proposed solution, a joulemeter, uses software-based power models to measure energy usage in the system. The next step would be to appropriately divide loads over the machines that show the best performance for the load that needs to be run. Versick et al.[29] researched this model and found that dividing loads in this way proves to be very efficient.

According to Orgerie et al.[22] the assumption that homogeneous nodes have the same power consumption[28] is wrong. They state that each server with a different load has a different power consumption; not only does the hosted software matter, but the way the hardware is assembled also plays an important role in its power consumption. A cooler node will consume less energy, since its fans will not be used as much as in a warm environment. On the software side, solutions have been available for some years now; for instance, processors are able to change their working frequency and voltage on demand to reduce their consumption. However, studying power usage in different types of servers remains a difficult task.

Based on these studies it becomes clear that energy measuring studies are being undertaken; however, most of them note that this is not easily accomplished in virtualized environments. The fact that test setups and their consumption depend on internal and external factors, as well as the fact that different loads can produce different numbers, basically means that a lot of different testing setups are required. To attain a vision or scale which shows how different hypervisors would score hosting different loads, these tests need to be done using different setups.


CHAPTER 4

Experiments

An important part of this research is the setup of the test environments and the results that follow from tests done using these environments. Before we start the testing we need to answer some questions, and once these are answered we need to measure the different systems. The questions that need to be answered are the following:

• What is the basic power consumption of an idle server or system?
• What is the power consumption when it is running one or more virtual machines?
• Which workloads are being run?

When these questions are answered and the answers are satisfactory, we can look at the different measuring setups.

4.1 Measuring Setups

4.1.1 Internal measurements

When we want to measure the system internally we need to install several measuring units between the motherboard and the different components. These units then need to be connected to a separate system, and software has to be written to do the reading and calculations on the measured values. This is a big investment, since the motherboard needs to be stripped and reassembled with these measurement units. In Amsterdam two scientific labs with such setups were investigated; both the VU and the HvA have these systems. At the HvA this lab, the SEFlab, has two servers with installed measuring units, using generation 8 and 9 processors. The connected components in the setup are shown below:

• Two processors
• Hard drives
• Memory
• Main board
• Fans

The software used measures the current in multiple chips on the motherboard a thousand times per second and prints its mean directly into a file. The software offers several modes: it is able to show the current or wattage of each of the different parts of the system, as well as the energy used during the execution. This setup allows us to determine the power consumption of the different hardware components during the execution of a program.
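
As a hedged illustration of the kind of post-processing such a log enables, the snippet below computes average power and total energy from a file with one "timestamp watts" pair per line; the file name and format are assumptions for illustration, not the SEFlab format.

# hypothetical log format: "<unix timestamp> <watts>" per line, file name power.log
awk 'NR > 1 { energy += prev_w * ($1 - prev_t) }          # accumulate watt-seconds (joules)
     { prev_t = $1; prev_w = $2; sum += $2; n++ }
     END { printf "average power: %.1f W, energy: %.1f J\n", sum / n, energy }' power.log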


Figure 4.1: © Internal measurement setup at the SEFlab[12]

It could provide rare insight into the usage of different parts of the hardware while running different kinds of loads on different hypervisors. This knowledge, in combination with other research such as Versick et al.[29], who researched the possibility of distributing loads based on their features over different hypervisors, might motivate different choices regarding hypervisors in different situations. This could lead to extended scales which would enable data centers to divide their loads over different kinds of hypervisors based on the loads' characteristics. A possible downside to this solution would be that there would be too many different servers, which would then not be loaded enough to be energy efficient.

4.1.2 External measurements

To measure a system externally we need a power supply tool that allows us to read the power input towards the server at all times. To come to a solid conclusion we first have to gather some data when the system is running idle; the next step is to run a hypervisor on the system and measure again, and each of these tests needs to be done for every server running each of the hypervisors we need to test. Thirdly, the environment is an important aspect of the power consumption of the server: a full server rack produces a lot more heat than a single server, so the fans, for instance, have to work extra hard to keep it all cooled down, which in turn requires more energy. If these baseline tests are done we should have enough information to establish a zero point for the energy usage when the machine is idle. We will then start with the actual spinning up of the guests and stressing of the system to see which one performs best.
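
A minimal sketch of such a baseline measurement is given below. It assumes the server's power draw can be read through a BMC that supports the DCMI power reading command; the interval and log file name are arbitrary choices, and this is an alternative illustration rather than the Grafana-based setup used later in this thesis.

#!/bin/bash
# log the server's power draw once per minute to build an idle baseline
# assumes ipmitool is installed and the BMC supports DCMI power readings
logfile=idle_baseline.log
while true; do
    watts=$(ipmitool dcmi power reading | awk '/Instantaneous power reading/ {print $4}')
    echo "$(date +%s) $watts" >> "$logfile"
    sleep 60
done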


Figure 4.2: External measurement setup

In figure 4.2 we see an external setup; the entire system is powered by one or more power units. Ideally these power units are connected to a network and can be monitored using an interface accessible from anywhere on said network. The server normally runs some sort of Dom 0 software; this could be Linux with a KVM extension, Xen or any other hypervisor. On the Dom 0 one or more virtual machines can be run.

4.2 Setup

Our tests were conducted using the external measurement system at Schuberg Philis, using two setups: one running Xen and the other KVM. The data centers at Schuberg Philis mainly consist of generation 8 and 9 HP ProLiant servers; for our tests we used generation 8, because the only servers running both Xen and KVM were generation 8 machines. We used a web interface called Grafana to retrieve the power consumption for every test we conducted; these values are measured at the socket and are more accurate than those measured by the CPU chips in the machines themselves. These servers have an option to run an energy efficient policy, powering down when there is next to no activity. A big downside to this policy is the latency created by powering up when there is activity again, which is reason enough for most hosting companies not to utilize it. Because this policy is not widely used, it will not be tested.

Firstly, we started by doing a small CPU-stress test, which tries to request as much computation as possible from the CPU in the virtual machines; we did this to confirm and test the setup. As this was deemed successful we scheduled different tests to stress multiple aspects of the server, starting with a combined test and followed by another CPU-stress test, an HDD-stress test, an IO-stress test and finally a memory-stress test. We will call these dedicated tests, since they are dedicated to only CPU, IO, etc.

In order to run the tests we need some software to assist us in this endeavor. Firstly, we need the hypervisors, on which virtual machines running CentOS will be deployed. To stress the machines we use the Linux stress tool. Last but not least, to create our graphs and tables we use Grafana. Below is a small overview of the software and the versions used:

• Xen version 6.2[15]
• KVM version 7.1[10]
• Stress[13]
• CentOS version 7.1[1]
• Grafana[5]

4.3 Results

To create the environments with the software specifications mentioned in the previous section, we used CloudStack in combination with some scripts to spin up the required virtual machines on the different servers. Running the tests takes some time; for this reason, and because we did not want to accidentally change parameters between later tests, scripts were created and run one after another. These scripts are listed in the appendix. We started with a small CPU-stress test to see if all parameters were correctly set and then continued with a series of tests. The series consisted of a combination test, stressing all parts of the machine, followed by a CPU-stress test, an HDD-stress test, an IO-stress test and finally a memory-stress test.
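
A minimal sketch of how such a back-to-back run could be wired together is shown below; it assumes the appendix scripts are saved as combined.sh, cpu.sh, hdd.sh, io.sh and memory.sh (these file names are our own and do not appear in the thesis).

#!/bin/bash
# run the stress scripts from the appendix one after another,
# logging start and end times so the phases can be matched to the power graphs
for test in combined cpu hdd io memory; do
    echo "$(date +%s) starting $test" >> schedule.log
    ./"$test".sh
    echo "$(date +%s) finished $test" >> schedule.log
    sleep 300   # cool-down gap so the phases show up as separate peaks
done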

In this section the results from tests using both Xen and KVM running different loads on different machines are presented.

Figure 4.3: The power consumption [Watt] of the initial CPU-stress test. On the x-axis the time [hour] is listed.

Figure 4.4: Results of the first stress test displayed in figure 4.3. Current is the current power consumption, and total is the total amount of Watts used during the test.


Figure 4.4 lists the current and total power consumption over the entire test. As we can see, Xen uses significantly more energy than KVM when both are idle; this can be traced back to the fact that KVM is a simple extension to the Linux kernel and is able to enter sleep states, whereas Xen is not. These values are in line with measurements done by Shea et al.[26], who state that KVM takes advantage of the standard Linux power saving methods, whereas Xen cannot use these because it is not allowed to properly manage the power of the system. When we tested both machines it became apparent that Xen had significantly more problems with user input during the CPU-stress test than KVM. These problems mainly consisted of non-responsiveness or extreme delay on user input, which shows that besides using less energy KVM also outperformed Xen.

The first peak seen in figure 4.3 is the increase of virtual machines by 10 running on KVM and the stress test being started. In the middle of this a slight increase in power consumption can be noticed; this is the second set of 10 virtual machines being started. Each test runs for 20 minutes; the first descent in consumption at the 16:45 hour mark is caused by the first 10 virtual machines powering down, and the second descent shows the second set of virtual machines powering down. Our first assumption was a linear increase towards 500 Watt: one virtual machine running this stress test consumes roughly 30 Watt, so multiplying this by ten and adding the idle power consumption we would reach around 500 Watt. However, as we can see in figure 4.3, there is a significant rise in power consumption when we start the first 10 virtual machines running this stress test, but when the second set of 10 virtual machines is started the second rise is nowhere near as high as the first. Note that the difference in power consumption between KVM and Xen is minimal here: where in the idle state there was a difference of around 50 Watts, at the top of their performance the difference is a mere 3 - 5 Watts.

It also becomes clear that it is very important for the energy efficiency of these machines that they are used at about 80 to 90 percent of their capacity; this can be deduced from the small impact the second set of 10 virtual machines had on the total power consumption of these servers. The first increase was more than double, since we climbed from 150 Watt to 350 Watt and from 230 Watt to 360 Watt. The second set of virtual machines added only another 20 to 30 Watt, depending on which machine was used.
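
A hedged back-of-the-envelope check of the linear assumption above, using only the figures quoted in this section (roughly 30 W per stressed virtual machine, idle draws of about 150 W for KVM and 230 W for Xen, observed peaks of about 350 W and 360 W):

# naive linear prediction versus the observed peaks quoted above
echo "KVM: predicted $((150 + 10 * 30)) W, observed ~350 W"   # predicts 450 W
echo "Xen: predicted $((230 + 10 * 30)) W, observed ~360 W"   # predicts 530 W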

We continued by using the same setup, machine and distribution wise; however, we changed the stress test so we could stress the memory, disk and IO. In figure 4.5 we see the entire graph showing all measured results. Note that there are four lines in the graph: each server has two power supplies and both are measured. We could not add them together in the Grafana web interface, because the individual graphs sometimes miss a data point, which leads to odd dips and rises in the total graph.

Figure 4.5: Overview of the results. From left to right we see, respectively, a combined test, CPU test, HDD test, IO test and memory test.


Figure 4.5 shows a series of tests, starting with a combined test, then followed by a CPU-stress test, HDD test, IO test and a memory test. We will cut the overview image into a smaller image so we can actually see the difference in wattage between the different systems.

Figure 4.6: Combined test results.

Figure 4.6 shows the wattage measured during an intensive combined test; this means that we executed some CPU stress, allocated memory, did some reads and writes on the hard drive and stressed the IO. In the graph we see that when the systems are idle, Xen uses significantly more energy, as explained previously. During the combined stress test we see that KVM starts to use more energy than Xen; however, this graph does not take into account the performance: how many blocks were allocated, how many reads and writes were performed. In the next section the performance aspect of the test will be discussed.

In figure 4.5 the second set of peaks is similar to the graph in figure 4.3, as both represent CPU-stress tests, and as explained earlier Xen is outperformed by KVM. The third spike in activity in figure 4.5 belongs to the HDD stress test; we see a lot of peaks and troughs, which can be explained by the waits between different reads and writes. For this test both Xen and KVM are really similar in their power usage. The fourth graph shows an IO-stress test, where the power consumption does not differ much between the two systems. In the final test, the memory-stress test, we can again see a significant difference between the two systems. KVM shows a constant power consumption, whereas Xen has peaks and troughs; the peaks do not lie very far from those of KVM, but the troughs are significantly lower.


more this table will be in favor of KVM. In current hosting companies these situations are quite common, though; a lot of the time servers will be idle, so these numbers are quite relevant. Another important aspect not taken into account is the performance of the machines.

4.4 Performance

The power usage graphs do not clearly favour one system; they lie very close together, so to reach a verdict about the power usage of the systems in different situations we have to take a closer look at the performance output of the systems. When we started the first CPU-stress test we tried to access both servers; during the stress test Xen was nearly inaccessible, whereas KVM still reacted to user input. Though this is not enough to assume that KVM outperforms Xen in the performance aspect of every test, it did pique our interest in other performance numbers. During the tests, sar files were created so we could study these if needed. Sar, the System Activity Reporter, monitors different system loads; these log files can be read using kSar. This program turns the sar file, which is nothing more than lines of performance numbers, into neat graphs, thereby making these files easily comparable. When viewing the figures below, the left ones belong to Xen and the right ones to KVM.
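
For reference, a minimal sketch of how such sar data can be collected and inspected with the standard sysstat tools; the file name and sampling interval are arbitrary choices, not the ones used in this thesis.

# record system activity every 10 seconds, 2000 samples, into a binary sar file
sar -o stresstest.sar 10 2000

# afterwards, inspect specific subsystems from the recorded file
sar -w -f stresstest.sar    # context switches per second
sar -B -f stresstest.sar    # paging statistics (faults, pages in/out)
sar -b -f stresstest.sar    # I/O and transfer rates
sar -r -f stresstest.sar    # memory utilisation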

4.4.1 Comparing performance

First we look at the number of context switches, shown in figure 4.8, that happened during the tests. Xen has a massive peak, reaching nearly 35000 in the second to last test; at that point in time, KVM's peak lies around 3500. During the rest of the tests, the number of switches lies around 300 for Xen and around 200 for KVM, so KVM comes out on top regarding these statistics. Context switches occur during the parallel execution of programs on the processor; during every one of these switches the state of the process needs to be saved so that, when the processor is ready, the process can continue from the point where it was paused. These switches are relatively expensive, since the scheduler needs to reschedule the process whenever a switch occurs. So when the number of switches increases, performance goes down and the execution time increases, costing more energy.

Figure 4.8: Context switches during the test of figure 4.3.

Secondly, we take a look at the disk reads and writes in figure 4.9, both in the combined and the dedicated stress tests. During the combined test Xen was outperformed by KVM, but during the dedicated tests Xen outperformed KVM. This can be seen in figure 4.9: in both graphs we see two peaks, the first belonging to the combined test and the second to the dedicated test. The left side belongs to Xen, whose peaks reach 17,500 and 22,500, whereas KVM reaches 22,500 and 22,000. More alarming is the fact that the number of faults made by Xen is more than 40 times that made by KVM. This means that, to compensate, more reads and writes are necessary, and effectively more energy is consumed.

Figure 4.9: Paging activity during the test of figure 4.3.

If we look at the IO graphs in figure 4.10 we see something rather odd: the number of transfers on the KVM server is significantly lower than on the Xen server, so we would expect the amount of data transmitted to be lower on the KVM machine as well. However, the amount of data is significantly higher on the KVM machine during the combined test and more or less equal during the IO-test.


As we can see in figure 4.11, the memory usage itself is not really interesting, because the quantity of memory available is not equal between Xen and KVM: KVM has nearly 8 gigabytes available where Xen only has 4. Knowing this, it is not really abnormal to see the KVM power usage graph rise in figure 4.5, since more memory is available for KVM, more of it will be used, and ultimately more power is needed to do this. We can clearly see the consequences of this difference in the overall power consumption graph. In the combined stress test in figure 4.6 we see a higher KVM graph, the difference in power usage hovering around 40 watts. In the memory test, figure 4.5, we see a difference of around 20 watts in favor of Xen.

Figure 4.11: Memory usage during the test of figure 4.3.

4.4.2 Experimental conclusions

Based on the power consumption graph in figure 4.5 it seems that Xen is more energy efficient than KVM whenever it is not idle; however, as shown above, power consumption is not everything. KVM clearly outperforms Xen in most of the computational performance measured. Furthermore, the amount of RAM in the different systems probably accounts for the rather large difference between the two systems in the combined test and the memory test. If we compare these results with recent research by Soriga et al.[27], we see that the advantages that KVM has over Xen have been noted by them as well. Our experiments were done using twenty virtual machines, so to compare our results with those shown by Soriga et al.[27] we should look at their experiments with high numbers of virtual machines. The research done by Soriga et al. shows that the difference between the two hypervisors is not extremely big and tends to edge towards KVM; in our research this trend continues, as we see that KVM outperforms Xen on almost every test. In the conclusion of their research Soriga et al. state that KVM keeps increasing its performance with every new version of the software[27]. This statement applies to our results as well: considering the time passed since their research and the fact that KVM development did not stop, it is only logical that it would eventually start to outperform Xen.

Our results show that both when idle and when in use KVM is the better alternative of the two: when idle KVM uses less power than Xen, and when used KVM performs computationally better than Xen. The differences in power consumption when the servers are in use are minimal; as long as the servers have a computational usage of around 80 percent, the power consumption is about even. The real advantage is gained when the servers are idle and KVM is used instead of Xen. If we assume that data centers are using the current generation of hypervisors, the number of servers running idly could be quite large, so by using KVM instead of Xen they could save up to 60 watts for every server running in these centers.


CHAPTER 5

Conclusion

5.1 Conclusion

The cloud keeps expanding, and as long as people are able to read articles, post photos and share life experiences, the environmental impact does not really seem to bother them. This train of thought is not shared by the hosting companies that enable us to use the Internet; some of these companies have joined hands in trying to work out the impact of the Internet on the environment and how to lower this impact.

The different software used by these hosting parties also has an impact on the power consumption, so these solutions need to be put on a green scale as well. However, this will be a small scale, since the companies who participated in this research only used Xen and KVM. Xen was tested in production environments, whereas KVM was only used in testing environments; this is because KVM is not yet used to supply customers of the cloud companies with servers. The main reason for companies to choose these hypervisors is the fact that both are open source; this means that the code is published and companies are able to submit alterations and improvements, so support will never be terminated as long as companies are using the software.

To be able to compare KVM and Xen on their performance and power consumption, tests were conducted. We set up an environment of two hypervisors running twenty virtual machines, one running Xen and the other running KVM. The results show that while running the standard stress tests KVM slightly outperformed Xen regarding the performance measurements; however, most power consumption results lie close together. Keeping this in mind we looked at the performance of both machines and found that KVM actually outperformed Xen in all but one area, namely IO performance. The biggest difference between the two is their respective power consumption when idle: KVM proves to be much more efficient in shutting down the clock, thereby decreasing the machine's power usage by roughly 50 watts.

When we compare our results with the results found in the literature study, we see that, performance wise, according to those studies KVM is outperformed by Xen, although the gap between the two has slightly diminished over the years. In our results we see that KVM slightly outperforms Xen in most situations, but we have to keep in mind that we primarily looked at the power consumption of both hypervisors and only resorted to performance because the power results did not point to a clearly better solution. The tests we did were simple stress tests that are part of the Linux system, so caution should be used in comparing our results with the benchmark results found in the literature study.

In figure 6.1 we revisit our first table showing the power consumption of different size data centers and add an extra column showing a potential gain, judging by the results we found in our tests. Let us assume that an average server is idle 40 percent of the time. The difference in idle power consumption lies around 60 Watts; if we multiply this by the number of servers according to the table, the potential savings become clear.
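
A hedged sketch of that calculation, using the assumptions stated above (a 60 W idle difference, 40 percent idle time) and the server counts from the table in chapter 1; the conversion to a yearly figure (8760 hours) is ours.

# estimated saving per server per year: 60 W difference * 40% idle * 8760 hours / 1000
echo "scale=1; 60 * 0.40 * 8760 / 1000" | bc   # about 210.2 kWh per server per year

# scale up to the data center sizes from chapter 1
for servers in 100 500 5000; do
    echo "$servers servers: roughly $(echo "60 * 0.40 * 8760 * $servers / 1000000" | bc) MWh per year"
done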


When we spoke to some of the cooperating companies in the Greening the cloud project, they stated that they mostly use the Xen hypervisor. This means that if they were to transition towards KVM, they would save a lot of energy when running idle. When the systems are not idle, performance would increase slightly, since KVM proved to be a bit better performance wise. Based on our results and the insights gained, we can make a rough sketch of how a scale might be drawn using four different aspects of a server: IO, CPU, memory and HDD writes. All these aspects will be shown in the scale, as well as the performance when the systems are idle. In the scale we would like to show the computational performance of the different systems as well as the power consumption. Our sketch would look something like figure 5.1; again, this is a mere first draft of this potential scale.

Figure 5.1: Proposed power consumption and performance scale.

In the scale in figure 5.1 we see two sets of red and green lines. The two outer ones depict the performance of the hypervisors, and the inner lines with corresponding colors show their power consumption in relation to each other. For the outer lines, the larger the distance from the center, the higher the performance on the scale. The same holds for the inner lines representing the power consumption: the larger the distance from the center, the higher the power consumption on the scale. To summarize, a good result would be a graph with wide outer lines and inner lines close to the center. The results shown by the scale are only applicable to the software versions used, the hardware and its position in the data center, and the tests being done. If any one of these parameters changes, the tests need to be run again. Whenever there are changes in either the software or the hardware, a new scale can be created and compared to the old one to see if the consumption and performance numbers still add up.


CHAPTER 6

Future research

6.1 Future research

The research presented in this thesis consists of power consumption tests using smart power supplies at the base of the servers. We mentioned earlier the possibility of measuring the power consumption of hypervisors internally. This internal testing could provide insight into where the power spikes lie, which could be communicated back to the software engineers building the hypervisors so they could possibly adjust their strategies.

A different concern in the hosting business is of course the placement, upkeep and power consumption of the server stacks; quite often entire company basements are filled with buzzing, heat-generating and energy consuming server machines idly gathering dust. In our research we were not in a position to adjust server placements, although, to minimize the energy each server uses to cool its parts, an optimal placement strategy needs to be devised. Research in this area could prove to have a lot of impact, especially on starting companies. Nowadays entire stacks can be provided by retailers; however, how do these rigs perform compared to hand-built systems? In one of the papers quoted above it was stated that server density makes a big difference in power consumption; researching these statements could prove to be enlightening, especially since pre-built systems more often than not are more expensive.

Computers are very energy inefficient: between 1 and 5 percent of the energy going in comes out through the cables as traffic, and between 95 and 99 percent is basically heat radiating from the system. To make sure the system does not overheat, a smart cooling system is needed to transfer the heat to a place where it can radiate freely. For every Watt needed to power the system another Watt is needed to cool it down, since the Watt consumed becomes heat which needs to be cooled away again. Ideally one would want a system which is low in energy cost and easily cooled. In recent years we have seen a staggering increase in smart machines which rely on the ARM chipset. This chipset, which has been around for 30 years, has fully blossomed with the arrival of PDAs, iPods and other portable electronics. The ARM is very energy efficient, and together with the fact that its clock speed was recently sped up to 2.5 gigahertz, this makes the chipset a perfect substitute for the conventional hardware in data centers, especially since the ARM chip is very cheap to produce and very scalable in its usage. This new way of building servers could mean that the energy usage of conventional hypervisors could plummet. The scale that was proposed in the conclusion could be extended with the kind of server it was run on; by doing so an interesting total image of a server system could be reviewed.


Bibliography

[1] CentOS. https://www.centos.org/. Accessed June 2015.

[2] CloudStack. https://cloudstack.apache.org/. Accessed August 2015.

[3] Flexiant - common considerations when selecting your hypervisor. http://learn.flexiant.com/hs-fs/hub/154734/file-494587875-pdf/Common considerations when selecting your hypervisor.pdf. Accessed August 2015.

[4] Google's practical tips for efficiency in data centers. http://www.google.com/about/datacenters/efficiency/external/. Accessed August 2015.

[5] Grafana. http://grafana.org/. Accessed June 2015.

[6] Greening the cloud. http://www.greeningthecloud.nl. Accessed March 2015.

[7] How companies are creating the green internet. http://www.greenpeace.org/usa/wp-content/uploads/legacy/Global/usa/planet3/PDFs/clickingclean.pdf. Accessed August 2015.

[8] Hypervisor 101: Understanding the virtualization market. http://www.datacenterknowledge.com/archives/2012/08/01/hypervisor-101-a-look-hypervisor-market/. Accessed May 2015.

[9] Kernel virtual machine. http://www.linux-kvm.org/. Accessed June 2015.

[10] KVM. http://www.linux-kvm.org/page/Main_Page. Accessed June 2015.

[11] Power usage effectiveness. https://en.wikipedia.org/wiki/Power_usage_effectiveness. Accessed June 2015.

[12] Software energy footprint lab. http://www.seflab.com/seflab/. Accessed May 2015.

[13] Stress. http://linux.die.net/man/1/stress. Accessed June 2015.

[14] Understanding Full Virtualization, Paravirtualization, and Hardware Assist.

[15] Xen. http://www.xenproject.org/. Accessed June 2015.

[16] Paul Barham, Boris Dragovic, Keir Fraser, Steven Hand, Tim Harris, Alex Ho, Rolf Neugebauer, Ian Pratt, and Andrew Warfield. Xen and the art of virtualization. Volume 37, pages 164–177, New York, NY, USA, October 2003. ACM.

[17] T. Deshane, Z. Shepherd, J. Matthews, M. Ben-Yehuda, A. Shah, and B. Rao. Quantitative comparison of Xen and KVM. In Xen summit, Berkeley, CA, USA, June 2008. USENIX association.

[18] Dr. Ralph Hintemann. Die trends in rechenzentren bis 2015. 2010.

[19] Yichao Jin, Yonggang Wen, Qinghua Chen, and Zuqing Zhu. An empirical investigation of the impact of server virtualization on energy efficiency for green data center. volume 56, pages 977–990, 2013.

[20] Aman Kansal, Feng Zhao, Jie Liu, Nupur Kothari, and Arka A. Bhattacharya. Virtual machine power metering and provisioning. In Proceedings of the 1st ACM Symposium on Cloud Computing, SoCC ’10, pages 39–50, New York, NY, USA, 2010. ACM.


[21] J. Li, Qingyang Wang, D. Jayasinghe, Junhee Park, Tao Zhu, and C. Pu. Performance overhead among three hypervisors: An experimental study using hadoop benchmarks. In Big Data (BigData Congress), 2013 IEEE International Congress on, pages 9–16, June 2013.

[22] Anne-Cécile Orgerie, Laurent Lefèvre, and Jean-Patrick Gelas. Demystifying energy consumption in grids and clouds. In Green Computing Conference '10, pages 335–342, 2010.

[23] P. Vijaya Vardhan Reddy and Lakshmi Rajamani. Evaluation of different hypervisors performance in the private cloud with SIGAR framework. In International Journal of Advanced Computer Science and Applications (IJACSA), 2014.

[24] P. Vijaya Vardhan Reddy and Lakshmi Rajamani. Performance comparison of hypervisors in the private cloud. In Malay Kumar Kundu, Durga Prasad Mohapatra, Amit Konar, and Aruna Chakraborty, editors, Advanced Computing, Networking and Informatics - Volume 2, volume 28 of Smart Innovation, Systems and Technologies, pages 393–402. Springer International Publishing, 2014.

[25] Gunnar Schomaker, Stefan Janacek, and Daniel Schlitt. The energy demand of data centers. In ICT Innovations for Sustainability, pages 113–124, 2015.

[26] R. Shea, Haiyang Wang, and Jiangchuan Liu. Power consumption of virtual machines with network transactions: Measurement and improvements. In INFOCOM, 2014 Proceedings IEEE, pages 1051–1059, April 2014.

[27] S.G. Soriga and M. Barbulescu. A comparison of the performance and scalability of xen and kvm hypervisors. In Networking in Education and Research, 2013 RoEduNet International Conference 12th Edition, pages 1–6, Sept 2013.

[28] Malgorzata Steinder, Ian Whalley, James E. Hanson, and Jeffrey O. Kephart. Coordinated management of power usage and runtime performance. In NOMS, pages 387–394. IEEE, 2008.

[29] Daniel Versick, Ingolf Wa, and Djamshid Tavangarian. Power consumption estimation of cpu and peripheral components in virtual machines. volume 13, pages 17–25, New York, NY, USA, September 2013. ACM.


CHAPTER 7

Appendix

7.1 Scripts

In this section the scripts used to perform the tests are shown.

The combined test script.

#!/bin/bash
name=combined
runtime=1200
start=$(date +"%s")
end=$(($start+$runtime))
now=$start
pid=$$

echo "+ start[$pid] - time: $start, date: $(date)"
count=0

while [ "$now" -lt "$end" ]
do
    $(stress -v -c 2 -i 2 -m 2 --vm-bytes 750M -d 1 --hdd-bytes 2G -t 1260s)
    sleep 1
    now=$(date +"%s")
    let count=$count+1
    echo "++ meantime $name[$pid] - loops: $count, runtime: $(($now-$start)), date: $(date)"
done

echo "+ done $name[$pid] - now: $now, start: $start, runtime: $(($now-$start)), count: $count"

The CPU test script.

#!/bin/bash
name=cpu
runtime=1200
start=$(date +"%s")
end=$(($start+$runtime))
now=$start
pid=$$

echo "+ start[$pid] - time: $start, date: $(date)"
count=0

while [ "$now" -lt "$end" ]
do
    $(stress -v -c 2 -t 1260s)
    sleep 1
    now=$(date +"%s")
    let count=$count+1
    echo "++ meantime $name[$pid] - loops: $count, runtime: $(($now-$start)), date: $(date)"
done

echo "+ done $name[$pid] - now: $now, start: $start, runtime: $(($now-$start)), count: $count"

The HDD test script.

#!/bin/bash
name=hdd
runtime=1200
start=$(date +"%s")
end=$(($start+$runtime))
now=$start
pid=$$

echo "+ start[$pid] - time: $start, date: $(date)"
count=0

while [ "$now" -lt "$end" ]
do
    $(stress -v -d 2 --hdd-bytes 1G -t 1260s)
    sleep 1
    now=$(date +"%s")
    let count=$count+1
    echo "++ meantime $name[$pid] - loops: $count, runtime: $(($now-$start)), date: $(date)"
done

echo "+ done $name[$pid] - now: $now, start: $start, runtime: $(($now-$start)), count: $count"

The IO test script.

#!/bin/bash
name=io
runtime=1200
start=$(date +"%s")
end=$(($start+$runtime))
now=$start
pid=$$

echo "+ start[$pid] - time: $start, date: $(date)"
count=0

while [ "$now" -lt "$end" ]
do
    $(stress -v -i 2 -t 1260s)
    sleep 1
    now=$(date +"%s")
    let count=$count+1
    echo "++ meantime $name[$pid] - loops: $count, runtime: $(($now-$start)), date: $(date)"
done

echo "+ done $name[$pid] - now: $now, start: $start, runtime: $(($now-$start)), count: $count"

The memory test script.

#!/bin/bash
name=memory
runtime=1200
start=$(date +"%s")
end=$(($start+$runtime))
now=$start
pid=$$

echo "+ start[$pid] - time: $start, date: $(date)"
count=0

while [ "$now" -lt "$end" ]
do
    $(stress -v -m 4 --vm-bytes 750M -t 1260s)
    sleep 1
    now=$(date +"%s")
    let count=$count+1
    echo "++ meantime $name[$pid] - loops: $count, runtime: $(($now-$start)), date: $(date)"
done

echo "+ done $name[$pid] - now: $now, start: $start, runtime: $(($now-$start)), count: $count"
