
Bachelor Informatica

Linux Kernel Modules in Rust

Paul Lagerweij

January 29, 2021

Supervisor: Drs. T.R. Walstra

Universiteit van Amsterdam

Abstract

The programming language Rust has gained a lot of attention in recent years. It is a type-safe language with a unique memory management model. Contemporary research has shown that Rust has inherent security benefits that can greatly improve kernel safety, while at the same time demonstrating great performance. Currently the preferred programming language for these use cases is C, but C is known for unintentionally contributing to many memory-related security bugs. Rust, on the other hand, is a potential solution. Adopting new technologies in any existing software is always a challenge, because new software can introduce unknown factors. This concern is amplified in kernel development due to the high importance and stability requirements of kernels. More testing and research can help minimize this concern and eventually reach the goal of building safer kernels. This project aims to further that goal by looking at the impacts of using Rust compared to C. It appears that Rust does indeed demonstrate great performance when run in a kernel, but its development infrastructure lacks maturity.


Contents

1 Introduction
  1.1 Context
    1.1.1 Comparing Rust vs C
    1.1.2 User space and kernel space
    1.1.3 Social relevance
  1.2 Research questions
    1.2.1 Hypothesis
2 Background
  2.1 Design of Rust
    2.1.1 Ownership
  2.2 Kernel components in Rust
    2.2.1 Current Rust kernel development
  2.3 Kernel and user space
    2.3.1 Types of kernels
3 Implementation
  3.1 Methodology
  3.2 Synthetic kernel modules
    3.2.1 Unexpected performance regression
    3.2.2 Mitigation against compiler optimizations
  3.3 User space to kernel examples
4 Results
  4.1 Runtime results of Rust and C
  4.2 Binary sizes of Rust and C kernel modules
  4.3 Kernel vs user space
5 Conclusion
  5.1 Discussion
    5.1.1 Additional remarks
    5.1.2 Future work
Appendices


CHAPTER 1

Introduction

1.1 Context

Rust is a modern programming language focused on safety and performance. It was first introduced by the company Mozilla in 2010 (https://www.rust-lang.org/). Since then, Rust has seen continued development from its open-source community and gained much popularity: in Stack Overflow's 2020 developer survey, Rust was voted the most loved programming language. Examples of popular programs that use Rust are Firefox, Discord and 1Password. A use case where Rust is still growing is low-level systems software, for example operating systems. C is currently the standard language for building operating systems because of the performance it offers. Like C, Rust is statically typed with no runtime garbage collector, which suits performance-critical applications. Unlike C, however, Rust has a secure and transparent memory management model. Without extra security precautions, C programs are by default vulnerable to memory-related bugs. As such, Rust occupies a unique spot: its performance is comparable to C, but it does not sacrifice security.

The benefits of Rust can largely be credited to how Rust manages memory. It accomplishes this using a memory model called ownership. Ownership allows Rust to forgo a garbage collector at runtime; instead, the compiler automatically determines at compile time when and where memory needs to be freed. Besides ownership, Rust has other security-related benefits compared to C, for instance compiler mitigations against out-of-bounds memory accesses and null pointer dereferencing. A more comprehensive explanation of ownership will be given in the next chapter.

1.1.1 Comparing Rust vs C

All these security benefits can, however, come at a cost in runtime performance, because they result in a compiled Rust program containing more code than a similar C program. This is especially important in operating systems, where performance is a priority. Previous benchmarks have shown that the performance impact of Rust is negligible compared to C [1]. Currently there are ongoing projects to prove that this also holds for operating systems. For example, Redox is a modern Unix-based operating system written completely in Rust [2]. This seems promising, but comparing performance results of Redox to an OS written in C, like Linux, is not informative, since these are completely different operating systems and such a comparison cannot account for differences in e.g. CPU schedulers and memory management. An informative comparison requires a more equal playing field where such differences are minimized.

Therefore, a comparison will be drawn between C and Rust in a controlled kernel environment. The kernel of choice is the Linux kernel, given its mature infrastructure and the open-source tools it provides. The Linux kernel is written in C, which makes it a perfect platform for this purpose. Comparing Rust to the kernel's default language can indicate whether Rust is viable for kernel programming. With these components in place, the comparison will be composed of custom Linux kernel modules built purely for testing, both in C and in Rust. The tests will focus on synthetic memory workloads, because that is where Rust, with its ownership memory model, differs the most from C. In short, these steps allow for reproducible and fair tests which can offer new insights into using Rust for kernel programming.

1.1.2 User space and kernel space

The secondary goal of this project is to investigate the use case of porting user space programs to kernels with Rust, in order to increase performance while maintaining security constraints. The purpose of the traditional strategy of dividing an operating system into a kernel and user space was to improve security by separating secure and insecure programs. However, with the unique features Rust offers, that division might not be necessary in the future [8]. Therefore, an attempt will be made to document practical examples of porting user space programs to the Linux kernel using Rust and to describe the implications of this potential paradigm shift. A more detailed explanation of the differences between user and kernel space will be given in the next chapter.

1.1.3 Social relevance

The impact this project can have relates to improving the security of computers in general. Most computers run some type of operating system, which needs to be secure and safe for daily use. In a developer presentation in 2019, Microsoft revealed that ∼70% of their security patches were caused by memory-related bugs, for example memory leaks or buffer overflows. With Rust's unique security features, this statistic could be reduced.

1.2 Research questions

This project will focus on two areas: the viability of Rust in kernel programming and the possibility of porting user space programs to kernels. Rust will be compared against the C programming language in a controlled kernel environment and their performance will be measured accordingly. Additionally, the impacts of porting user space programs to kernels using Rust will be examined with practical examples. In summary, the research questions can be formulated as follows:

1. What are the performance impacts of using Rust in Linux kernel modules compared to C?
2. What are the impacts of porting user space programs to kernels using Rust?

1.2.1 Hypothesis

Since the focus is on synthetic tests of heavy memory tasks, it is reasonable to hypothesize that the results will show some measurable difference between Rust and C. The extra security precautions of Rust might result in slower performance than C when pushed to the extreme. Furthermore, porting user space programs to kernels might result in small performance gains due to the benefits of kernel designs. Different kernel designs will be described in the next chapter.


CHAPTER 2

Background

2.1 Design of Rust

As stated in the introduction, Rust compiles straight to machine code and therefore does not need a runtime interpreter. The compiled program can be executed immediately on a system, without any extra translation steps in between. In general, this makes Rust faster than languages that do use interpreters. Interpreters do not immediately execute machine code; they first have to parse or translate the program's source code towards eventual machine code. There are many methods to perform these interpreter actions, e.g. direct translation or intermediate representations, but all these types of interpreters execute more steps than compiled languages do. Examples of languages that use interpreters include Java and Python. These languages differ in how their interpreters are implemented, but their underlying principle remains the same: at runtime there is no machine code that can be executed directly, so extra steps need to take place in order to execute a program. Therefore, the execution time of interpreted programs is greater than that of compiled programs.

The faster runtime performance of compiled programs is clearly a benefit and a reason to use such a language instead of an interpreted alternative. But there are other differences between the two that need to be considered. For example, compiled programs take longer to develop, because every time a programmer needs to test their code, a complete compilation step takes place first. Development in interpreted languages is therefore faster, especially for languages that use an intermediate representation, since compiling to an intermediate representation is faster than compiling to low-level machine code. Modern compilers have become faster, however, which has narrowed this development-time difference.

Besides development time, there is one other important reason one might prefer other higher-level languages over Rust. Generally speaking, low-level compiled languages are a lot more difficult to use safely than higher-level languages, because higher-level languages offer many abstractions that hide unsafe operations from the programmer. The biggest concern for programming languages is how they handle memory operations. Many high-level languages abstract these operations away from the programmer, but they must then solve the problem somewhere else. Languages like Python and Golang do this by running an extra process at runtime, called a garbage collector. Rust, however, aims to avoid runtime performance penalties as much as possible by implementing a novel alternative solution.

2.1.1 Ownership

In C, the programmer needs to program memory operations manually, which can lead to mistakes. Rust takes this burden away from the programmer and gives the responsibility to the compiler, using a memory model called ownership. Ownership works by labeling each value with its owner, and each value can only have one owner at a time. Ownership can be transferred, and when a value loses its owner by going out of scope, the value can be freed. This allows the compiler to keep track of when values are in use and when they need to be freed. Furthermore, the compiler can also track values in a way that prevents race conditions in concurrent applications. An analogy for this principle is a perfect C programmer who infallibly manages memory. As a result, Rust does not need a garbage collector at runtime, unlike programming languages such as Golang, which in turn greatly improves Rust's runtime performance.

To give an example of how ownership works, a Rust code snippet is shown below:

    {
        let s = String::from("hello"); // s is valid from this point forward

        // do stuff with s
    } // this scope is now over, and s is no longer valid

This example shows a string being allocated; when the string goes out of scope, it is freed again. Rust knows the lifetime of the string at compile time, hence it can compile these memory cleanup operations directly into the final machine code binary. Other compiled programming languages that utilize a garbage collector are not aware of this at compile time and instead have to execute extra steps at runtime to accomplish the same result. Ownership also has interesting implications for certain memory operations that work in other languages, but not in Rust. To illustrate this, below is another code example:

    {
        let s1 = String::from("hello");
        let s2 = s1;

        // do stuff with s2

        println!("{}, world", s1);
    }

If this code were given to the Rust compiler, it would generate an error and refuse to compile, because it knows this can result in a potential memory bug. Rust sees this because ownership of the value in s1 is transferred to s2. After the transfer, s2 can do whatever it wants with the value it represents, for example manipulate the string or destroy it. If s1 then tried to access the string, it could cause a memory error. To prevent these types of mistakes, the compiler restricts each value to a single owner at a time, so that it retains full control. As a result, Rust has an inherently safer memory model than C, which has also been formally proven by Jung et al. (2017) [5].
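
For comparison, the same code does compile when s2 merely borrows the value instead of taking ownership; s1 then remains the owner throughout. This variant is added here for illustration and is not in the original listing:

    {
        let s1 = String::from("hello");
        let s2 = &s1; // borrow: s1 keeps ownership, s2 holds a reference

        // do stuff with s2

        println!("{}, world", s1); // fine: s1 was never moved
    }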

2.2 Kernel components in Rust

Given the previously mentioned benefits of Rust in security and performance, Rust becomes appealing for operating systems programming. Operating systems must be robust and secure while operating under high performance constraints. If security were the only concern, writing an operating system in a higher-level language, e.g. Python with automatic runtime memory management, would be sufficient. However, security is not the only constraint: operating systems also need to conform to performance requirements. Examples include networking or graphics programs, where the usability of such applications depends on their performance. Therefore a high-level language like Python, which might be secure, is too slow and not acceptable for operating systems programming.

Rust, on the other hand, meets these requirements and could serve as an operating systems language. Traditionally the default language has been C, and that is still the default preference today. For instance, the popular operating system Linux is written in C. C has proven to be highly efficient, with fine-grained low-level hardware control. The downside of C, as stated before, is that it is insecure if no extra security precautions are taken.

2.2.1 Current Rust kernel development

At the time of writing, only a handful of Rust kernel projects are under active development. The relatively young age of Rust compared to other kernel languages means that kernel development in Rust is still a new and uncertain concept. The most prominent example of a kernel project written in Rust is the operating system Redox. Redox is an open-source Unix-based OS first introduced in 2015; its goal is to bring the innovations of Rust to a modern operating system. It should be noted that Redox is the biggest Rust kernel project currently being worked on and consists of many kernel concepts that could be analyzed separately; for simplicity, Redox is considered here as one project [2]. Another independent project, named RustyHermit, aimed to build a small unikernel completely in Rust; its developers demonstrated that it is possible to write a complete unikernel in Rust with performance similar to a C unikernel [6]. Different types of kernels will be discussed later.

The introduction of Rust and its subsequent adoption in big projects has caught the attention of various researchers. One research project by Levy et al. (2017) examined the security viability of using Rust in a kernel by looking at different low-level operations that Rust must be able to perform safely [7]. They concluded that Rust's safety features do not inhibit fine-grained kernel operations, for example Direct Memory Access, that are normally difficult to execute in a secure manner. Balasubramanian et al. (2017) found similar results when looking at security features in the kernel that would normally require slow and complex runtime services [3]. From these projects it can be concluded that Rust is secure enough for advanced kernel operations.

In terms of performance, there have been similarly positive results. A recent research project by Schuermann et al. (2020) ported a common VPN driver from C to Rust and reached comparable performance results. They were not able to complete the entire porting process due to unforeseen compatibility challenges between old Linux kernel structures and their new Rust code [9]. Another research project by Ellmann (2018) developed a more complete network driver in Rust, but with a different approach: it focused on the user space domain of an operating system instead of the kernel [4]. In summary, these results give a promising prospect of Rust's performance in modern kernel use cases.

2.3 Kernel and user space

The emphasis on the kernel is due to the separation between processes that run in kernel space and in user space. This separation is a form of access control at the lowest software level, allowing or preventing processes from accessing certain parts of the hardware they run on. For instance, mundane processes such as text editors are not allowed direct access to the program counter of a CPU. Essential processes of an operating system, such as a CPU scheduler, do require low-level access to the hardware. If all processes had access to all sensitive hardware components all the time, this would be an obvious security hole. This is why there exists a low-level separation embedded in an operating system, called kernel and user space.

At the software level this is handled by dividing the total system memory into two domains. Normal processes run in a limited, user space part of memory and are prevented from accessing other memory regions. More privileged processes run in the kernel space of memory and have access to everything. Typically, the most important process of an operating system, the kernel, runs in kernel space and also manages the processes running in user space. From now on, kernel space is referred to as just the kernel. With system memory safely divided, other hardware components systematically follow: the kernel tells the CPU which processes are running in the kernel or in user space, after which the CPU can delegate hardware privileges to each process.


With this picture in mind, it becomes clear that everything running in the kernel is essential to the correct working of the whole operating system. If something were to go wrong in the kernel, it could have catastrophic consequences for all processes and even for hardware components. Therefore, all software running in the kernel must be robust, safe and trusted. Software running in user space depends less on those requirements, since the effects of something going wrong there are less severe than in the kernel. Furthermore, many programs that serve a purpose outside of the kernel, for instance games downloaded from the world wide web, are less robust, secure or trusted. Such programs are much better off running in user space than in the kernel, and the majority of written software falls into this category.

Additionally, software in the kernel is structured differently from software in user space. In Linux, for example, user space programs share a lot of common code, such as print functions. Many of those common functions are shared and reused so that programmers do not need to reinvent the wheel in every program. In Linux, this is stored in a common library called libc. Each user space program can simply bind itself to libc and use all of its standard functions. Besides common print functions, libc also contains special functions called system calls. System calls serve as an API layer between user space programs and the kernel. When a program needs to execute something special, for instance reading a file, it uses a system call to ask the kernel to perform this on its behalf. The kernel then performs a context switch that temporarily pauses the user space program until the kernel is done with its task. In practice, these system calls and context switches happen all the time in an operating system, which shows the importance of libc. There is however one downside to this principle: context switches are slow, because each time one happens the CPU receives an interrupt signal to halt a process and to execute something different [10].
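
To make the system call path concrete, the following sketch (assuming the libc crate as a dependency; it is not part of the thesis code) performs a write(2) through the libc wrapper; each call crosses from user space into the kernel and back:

    fn main() {
        let msg = b"hello from user space\n";
        // write(2) is a thin libc wrapper around a system call, so this
        // single call triggers a context switch into the kernel and back.
        unsafe {
            libc::write(
                libc::STDOUT_FILENO,
                msg.as_ptr() as *const libc::c_void,
                msg.len(),
            );
        }
    }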

2.3.1 Types of kernels

Although all kernel designs feature both a kernel and user space, there exist different implementations of how these are structured. The two main types of kernel designs are microkernels and monolithic kernels. A microkernel focuses on size and stability by putting as many programs as possible in user space, for example device drivers and file system services. As explained before, this has the benefit of complete memory isolation, since all user space programs are separated from each other. Important processes, such as memory allocators and the CPU scheduler, run in the kernel, because they need access to all hardware components. If a user space program wants to communicate with other memory regions, a message-passing service is used, managed and verified by the kernel. Microkernels are generally used in embedded and realtime applications, where stability is of the utmost importance. Examples include QNX and seL4.

In contrast, a monolithic kernel separates fewer components than a microkernel. Device drivers and file system services all share the same kernel memory space with other important processes. This has the benefit of not requiring a separate message-passing service, which often makes microkernels slower. Examples include the popular Linux and Windows operating systems. The primary disadvantage of monolithic kernels is that they are less secure and stable, because many processes share the same memory space and can crash the kernel if memory corruption occurs. A security-oriented programming language like Rust could improve this.


CHAPTER 3

Implementation

3.1 Methodology

To compare the performance of Rust and C in a controlled and fair environment, their respective source codes should look very similar. A good starting point is to write the C kernel modules first and then to base the Rust kernel modules on that, since C kernel programming is used as a reference point. The tests will focus on synthetic memory workloads, because that is where Rust, with its ownership memory model, differs the most compared to C. Compiler optimizations will have to be taken into account as well, because those can interfere with the tests by inadvertently removing important code. Afterwards, the time command line tool will be used to measure the Rust and C execution times. The time command is able to measure both user space and kernel CPU times, which is sufficient for these tests. Different iterations of these tests will be run to better understand the performance capabilities of Rust in the kernel. Additionally, the size of the Rust and C binaries will be compared as well using the lsmod command line tool, which can gather binary sizes of loaded kernel modules in Linux.
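
As a cross-check of the time measurements, the user space variants can also be timed in-process. The following sketch is not part of the thesis setup, which relies on the time tool, but it measures the same synthetic workload with Rust's standard clock:

    use std::time::Instant;

    fn main() {
        let n = 100_000;
        let start = Instant::now();
        for _ in 0..n {
            // Same workload as the synthetic modules: a large zeroed array.
            let _arr: Vec<u8> = vec![0u8; 329_504];
        }
        // Wall-clock time; `time` additionally splits user and kernel CPU time.
        println!("{} iterations took {:?}", n, start.elapsed());
    }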

As discussed previously, user space programs can potentially be ported to the kernel using Rust. This will include searching for existing Rust programs that are suitable for being ported to the kernel. If this is possible, their runtime performance will be compared to their user space counterparts over a number of iterations.

3.2 Synthetic kernel modules

The first step is to determine if it is possible to compile a C kernel module and to measure its performance using the time command line tool. Two simple loops are created that iterate over a number of memory allocations:

    // User space version
    for (int i = 0; i < n; ++i) {
        uint8_t *arr = calloc(329504, sizeof(uint8_t));
        free(arr);
    }

    // Kernel version
    for (int i = 0; i < n; ++i) {
        uint8_t *arr = kcalloc(329504, sizeof(uint8_t), GFP_KERNEL);
        kfree(arr);
    }

Listing 1: Initial synthetic memory tasks in C


                 User space version CPU time   Kernel version CPU time
    n = 10000    0.110 sec                     0.214 sec
    n = 100000   1.052 sec                     1.986 sec
    n = 1000000  10.303 sec                    19.657 sec

Table 3.1: Initial results of average C runtime performance over 5 runs

The user space and kernel code above differ slightly as a result of different memory APIs; nevertheless, both loops execute a number of allocations of an arbitrarily large zeroed array. Zeroed arrays are chosen because standard allocations are lazily evaluated and only get backed by memory when used; requesting zeroed arrays forces the kernel to perform the allocation immediately. The test above demonstrates that it is possible to write a C kernel module and measure the execution times of synthetic memory workloads. Furthermore, it also indicates that there is a difference between kernel and user space memory performance, due to the different underlying allocation mechanisms in libc versus the Linux kernel.
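
The effect of the zeroing can be sketched in user space Rust as well: reserving capacity alone does not write to the memory, so the allocator may defer the real work, whereas a zero-filled allocation has to be backed immediately. This sketch is illustrative and not part of the test modules:

    fn main() {
        let n = 329_504;

        // Capacity is reserved, but no element is written; the underlying
        // pages may remain untouched until first use.
        let lazy: Vec<u8> = Vec::with_capacity(n);

        // Every byte is zeroed (or pre-zeroed pages must be provided),
        // so the allocation cost is actually paid here.
        let eager: Vec<u8> = vec![0u8; n];

        assert_eq!(lazy.len(), 0);
        assert_eq!(eager.len(), n);
    }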

To build Rust kernel modules, an existing Rust framework is used that links correctly against the Linux kernel (https://github.com/fishinabarrel/linux-kernel-module-rust). This framework takes inspiration from the standard C Linux kernel infrastructure and allows users to build custom Rust kernel modules through its own toolchain. The resulting code, mimicking the above C code as closely as possible, looks as follows:

    for _ in 0..n {
        let _arr: Vec<u8> = vec![0; 329504];
    }

Listing 2: Initial synthetic memory task in Rust

The Rust code above does not need to accommodate differences between user space and kernel APIs, because these are abstracted away by the Rust framework. The runtime performance results of this and all subsequent kernel modules will be given in the next chapter.
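
To make the framework's role concrete, the skeleton below sketches how a minimal module is registered with it: a type implementing the framework's module trait, hooked up through its registration macro. The item names (KernelModule, KernelResult, kernel_module!) follow the framework's published examples, but they should be treated as assumptions, since the API was still changing at the time of writing.

    // Approximate skeleton of a module built with the
    // linux-kernel-module-rust framework; exact item names may differ
    // between framework versions and are assumptions here.
    #![no_std]

    use linux_kernel_module::println;

    struct BenchModule;

    impl linux_kernel_module::KernelModule for BenchModule {
        fn init() -> linux_kernel_module::KernelResult<Self> {
            // Runs on insmod; the synthetic allocation loop goes here.
            println!("bench module loaded");
            Ok(BenchModule)
        }
    }

    impl Drop for BenchModule {
        fn drop(&mut self) {
            // Runs on rmmod.
            println!("bench module unloaded");
        }
    }

    linux_kernel_module::kernel_module!(
        BenchModule,
        author: b"test",
        description: b"Synthetic memory benchmark",
        license: b"GPL"
    );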

3.2.1 Unexpected performance regression

While creating these initial kernel modules, an unexpected performance regression was discovered. The Rust and C loops above have similar runtime performance in user space, but when moving to the kernel, Rust suddenly becomes ∼50% slower than its C equivalent. This was surprising, because at first glance there is nothing in this simple test that would indicate such a drastic performance decrease. Furthermore, the user space versions of these loops experience no such problem; it therefore became clear that the problem most likely resided in the Rust kernel module alone.

After searching through much source code, the cause was found to be a missing feature in the Rust framework. When the kcalloc() function is called in the C module, it allocates an array from kernel memory and makes sure that the memory is zeroed before returning it. To zero an array efficiently, kcalloc() makes use of a special kernel flag, GFP_ZERO, that optimizes this process. The Rust module, on the other hand, does not use this optimization when it allocates a zeroed array, because the framework depends on an internal Rust zeroing mechanism that is not aware of GFP_ZERO. To temporarily remove this difference, the C code is changed so that it no longer uses kcalloc() and instead uses a zeroing mechanism similar to what Rust uses:



    for (int i = 0; i < n; ++i) {
        uint8_t *arr = krealloc(NULL, 329504 * sizeof(uint8_t), GFP_KERNEL);
        memset(arr, 0, 329504 * sizeof(uint8_t));
        kfree(arr);
    }

Listing 3: Fix for regression issue in C kernel module

3.2.2 Mitigation against compiler optimizations

The next component is a mitigation against compiler optimizations. A possible method of accomplishing this is to introduce unknown runtime variables and a conditional based on those variables, so that neither the C nor the Rust compiler can determine at compile time which code can be optimized away. The kernel modules now look like the following:

    int to_be_printed = 42;
    for (int i = 0; i < n; ++i) {
        uint8_t *arr = krealloc(NULL, 329504 * sizeof(uint8_t), GFP_KERNEL);
        memset(arr, 0, 329504 * sizeof(uint8_t));

        arr[runtime_index] = runtime_value;
        if (arr[runtime_index] != 7) {
            to_be_printed = arr[runtime_index];
        }

        kfree(arr);
    }
    printk(KERN_INFO "Value = %d", to_be_printed);

    let mut to_be_printed = 42;
    for _ in 0..n {
        let mut arr: Vec<u8> = vec![0; 329504];

        arr[runtime_index] = runtime_value;
        if arr[runtime_index] != 7 {
            to_be_printed = arr[runtime_index];
        }
    }
    println!("Value = {}", to_be_printed);

Listing 4: C and Rust kernel modules with optimization mitigations

The unknown variables are supplied through accessible parameters residing in the /proc filesystem that Linux provides. This allows for communication with the kernel modules during runtime. However, this capability was not yet fully implemented in the current Rust framework, so it had to be added to the framework manually first.
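
To illustrate, a user space helper could feed these parameters to the module before a run. The sketch below assumes hypothetical /proc entry names, as the text does not list the actual paths:

    use std::fs;

    // Hypothetical /proc entries exposed by the kernel module; the real
    // names are not given in the text.
    fn main() -> std::io::Result<()> {
        fs::write("/proc/rust_bench/runtime_index", "3")?;
        fs::write("/proc/rust_bench/runtime_value", "9")?;
        Ok(())
    }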

3.3 User space to kernel examples

Looking for suitable Rust programs that can be ported to the kernel proved to be challenging, because most Rust programs depend on libc functions with no closely related kernel API alternatives. For example, standard input-output operations typically make use of IO functions that libc provides. Fortunately, there exist a few Rust libraries that explicitly state that they do not rely on libc. Two suitable programs were chosen: a compression and an encryption library.

These programs were then ported to Rust kernel modules using the same framework as before. Unfortunately, the kernel version of the compression library generates a kernel panic through a stack overflow of unknown cause, which the Rust compiler should have prevented; the same compression library functions correctly in user space. As no solution to this problem could be found, the encryption library remains the only working example of a user space program ported to the kernel:

    // Imports for the RustCrypto chacha20poly1305 crate, added here for
    // completeness; they were not shown in the original listing.
    use chacha20poly1305::aead::{Aead, NewAead};
    use chacha20poly1305::{ChaCha20Poly1305, Key, Nonce};

    for _ in 0..n {
        let msg = b"plaintext message";
        let key = Key::from_slice(b"an example very very secret key."); // 32 bytes
        let cipher = ChaCha20Poly1305::new(key);
        let nonce = Nonce::from_slice(b"unique nonce"); // 12 bytes

        let ciphertext = cipher.encrypt(nonce, msg.as_ref()).unwrap();
        let _plaintext = cipher.decrypt(nonce, ciphertext.as_ref()).unwrap();
    }

Listing 5: Encryption library example for Rust kernel porting


CHAPTER 4

Results

4.1 Runtime results of Rust and C

The following table shows the CPU runtimes of the C and Rust kernel modules over a number of iterations. Each iteration n executes the arbitrary memory allocation of a zeroed array. Increasing values of n are chosen to determine whether the C and Rust tests scale differently. Additionally, the GFP_ZERO optimization results are included, as previously discussed, to contrast C's performance with Rust, which cannot use GFP_ZERO.

                 C with GFP_ZERO   C without GFP_ZERO   Rust
    n = 10000    0.214 sec         0.322 sec            0.315 sec
    n = 100000   1.986 sec         3.054 sec            3.060 sec
    n = 1000000  19.657 sec        30.087 sec           30.134 sec


The following results depict the CPU runtimes when compiler optimization mitigations are included:

                 C with GFP_ZERO   C without GFP_ZERO   Rust
    n = 10000    0.221 sec         0.316 sec            0.369 sec
    n = 100000   1.982 sec         3.089 sec            3.141 sec
    n = 1000000  19.607 sec        30.052 sec           30.386 sec

Figure 4.2: CPU runtimes with compiler optimization mitigations

4.2 Binary sizes of Rust and C kernel modules

The following table shows the sizes of the above kernel modules gathered by the lsmod command:

                  C       Rust
    binary size   16 KB   798 KB


4.3 Kernel vs user space

The following results show the CPU runtimes of the cryptography library that was ported to the kernel using Rust:

                  User space   Kernel
    n = 100000    0.286 sec    0.126 sec
    n = 1000000   2.771 sec    1.133 sec
    n = 10000000  27.547 sec   11.204 sec


CHAPTER 5

Conclusion

5.1 Discussion

Looking at the runtime results of the synthetic tests, it is clear that Rust is as fast as C in heavy memory workloads, as long as C does not use its zeroing optimization. This indicates that Rust's ownership memory model does not negatively impact CPU runtime performance; the hypothesis of a measurably slower result was thereby wrong. Additionally, the compiler optimization mitigations did not seem to matter for either the Rust or C tests. This could mean that those mitigations were not necessary in the first place, or perhaps that both compilers are clever enough to remove the mitigations unnoticed. In terms of binary size, the Rust kernel modules were 50 times bigger than their C equivalents. This is most likely a result of Rust's security features and its inability to use many kernel APIs that C can make use of.

In conclusion, an answer can be given to the first research question of what the performance impacts are of using Rust in Linux kernel modules compared to C: looking at heavy memory workloads, Rust has comparable runtime results to that of C, but it lacks some optimization features that could improve it even further. There is however a negative impact in terms of binary size.

When looking at the results of porting a user space program to the kernel using Rust, it turns out that its performance improved by roughly 60%. This increase in performance is more than was expected and can be explained by the lack of context switches and an increased CPU scheduler priority in the kernel. There were however some issues in finding suitable Rust programs that could be ported due to the dependency on libc in most user space programs.

In conclusion, an answer can be given to the second research question of what the impacts are of porting user space programs to kernels using Rust: runtime performance improves greatly while at the same time not breaching kernel security because of the benefits Rust offers. Having said that, there are currently not many suitable user space programs for this use case. This, and some other observations, will be reflected on next.

5.1.1 Additional remarks

During the implementation process, some additional observations were made that should be noted. The relatively young age of Rust and of the framework being used made it difficult to build more tests. Many features were nonexistent in the Rust framework and would have required manually writing wrappers and bindings to the corresponding kernel functions, for instance for input-output and network operations. Granted, this is not the only reason the process was challenging: both Rust's young age and Linux's old age worked against each other and made them clash.

Another interesting event occurred while fixing the performance regression issue. During the many debug attempts to find the cause of the problem, at a certain point an unintentional memory leak was created in the C kernel module that went unnoticed for a long while. This was the result of misreading the documentation of a particular C kernel function, which leaks memory when used incorrectly. This type of mistake would not be possible in Rust, and although this event is anecdotal, it does suggest that Rust is a safer language to use than C. Furthermore, an extra experiment was performed out of curiosity: an out-of-bounds memory access was attempted in both the Rust and C kernel modules with no additional precautions present. The Rust kernel module stopped the operation preemptively and generated a descriptive error message. The C kernel module allowed the out-of-bounds memory access, which consequently corrupted the kernel.
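
For reference, the Rust side of that out-of-bounds experiment can be reproduced in user space with a few lines; the kernel module version behaves analogously, aborting with a descriptive message instead of corrupting memory. A minimal sketch, not taken from the thesis code:

    fn main() {
        let arr = vec![0u8; 10];
        let i = 10; // one past the end of the array

        // Rust's bounds check panics here with a descriptive message
        // ("index out of bounds: the len is 10 but the index is 10")
        // instead of reading adjacent memory as C would.
        let _v = arr[i];
    }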

5.1.2 Future work

In closing, the above observations suggest future projects that could take Rust kernel development further. One might gain more insight by looking at the compiled assembly code of the Rust and C kernel modules, in order to compare them at a deeper level. Steps could also be taken to improve the binary size of Rust kernel modules, for instance by taking inspiration from libc: common Rust code could be placed in an accessible kernel module that other modules can bind to. Moreover, further suitable user space programs could be sought that can be ported to kernels using Rust; this might result in a paradigm shift in the way the kernel and user space domains have traditionally been used. Having said all this, the biggest takeaway for future projects is first and foremost to improve the Rust kernel infrastructure by making it more feature complete, as feature completeness is currently the main benefit of using C.


Bibliography

[1] The Computer Language Benchmarks Game. url: https://benchmarksgame-team.pages.debian.net/benchmarksgame/.

[2] Redox OS. url: https://www.redox-os.org/.

[3] Abhiram Balasubramanian, Marek S. Baranowski, Anton Burtsev, Aurojit Panda, Zvonimir Rakamarić, and Leonid Ryzhyk. "System Programming in Rust: Beyond Safety". In: ACM SIGOPS Operating Systems Review 51(1), pp. 94–99 (2017).

[4] Simon Ellmann. "Writing Network Drivers in Rust". Bachelor thesis, Technical University of Munich (2018). url: https://sjmulder.nl/dl/pdf/unsorted/2018%20-%20Ellmann%20-%20Writing%20Network%20Drivers%20in%20Rust.pdf.

[5] Ralf Jung, Jacques-Henri Jourdan, Robbert Krebbers, and Derek Dreyer. "RustBelt: Securing the Foundations of the Rust Programming Language". In: vol. 2. POPL. Association for Computing Machinery, 2017. doi: 10.1145/3158154.

[6] Stefan Lankes, Jens Breitbart, and Simon Pickartz. "Exploring Rust for Unikernel Development". In: Proceedings of the 10th Workshop on Programming Languages and Operating Systems (PLOS'19). 2019, pp. 8–15. doi: 10.1145/3365137.3365395.

[7] Amit Levy, Bradford Campbell, Branden Ghena, Pat Pannuto, Prabal Dutta, and Philip Levis. "The Case for Writing a Kernel in Rust". In: Proceedings of the 8th Asia-Pacific Workshop on Systems. ACM, 2017, p. 1.

[8] Johannes Lundberg. "Safe Kernel Programming with Rust". Master thesis, KTH Royal Institute of Technology, Stockholm (2018). url: https://www.diva-portal.org/smash/get/diva2:1238890/FULLTEXT01.pdf.

[9] Zach Schuermann and Kundan Guha. "Linux Device Drivers in Rust". Research project, Columbia University (2020). url: https://zachschuermann.com/static/6118.pdf.

[10] E. Zadok, S. Callanan, Abhishek Rai, Gopalan Sivathanu, and A. Traeger. "Efficient and safe execution of user-level code in the kernel". In: 19th IEEE International Parallel and Distributed Processing Symposium. 2005. doi: 10.1109/IPDPS.2005.189.


APPENDIX A

Hardware and software specifications

CPU: Intel Core i7-4710HQ
Operating system: Ubuntu 20.04.1
Linux kernel: 5.4.0-60
GCC: 9.3.0
