

Bachelor Informatica

Open-source network operating systems: feature evaluation of SONiC

Erik Puijk, 11017651

June 7, 2018

Informatica

Universiteit van Amsterdam


Abstract

Open network switches are increasing in popularity and allow the deployment of different open-source network operating systems (NOSs). In contrast to locked-in switches, open switches with an open-source NOS are a relatively new phenomenon and have been tested less extensively. This thesis examines whether open switches with an open-source network operating system, namely SONiC, can be deployed successfully to perform fundamental networking operations. Furthermore, it examines for which use cases SONiC is suitable and beneficial. We experiment with open switches in various topologies to examine the deployment of fundamental OSI Layer 2 and Layer 3 networking features. We conclude that all SONiC-supported features we tested can be deployed successfully. Moreover, we examine several use cases of open switches with SONiC and conclude that SONiC is most suitable for use cases in large-scale data centers and enterprise networks.


Contents

1 Introduction

2 Open networking background
   2.1 Open switches
   2.2 Network operating systems
      2.2.1 ASIC communication
      2.2.2 ASIC control module
   2.3 SONiC
      2.3.1 Quagga

3 Networking features
   3.1 Layer 2 features
   3.2 Layer 3 features

4 Experiments
   4.1 Preparatory phase
      4.1.1 Mellanox SN2100 and ONIE
      4.1.2 Arista 7050QX-32S and Aboot
   4.2 Feature tests
      4.2.1 Layer 2 features
      4.2.2 Layer 3 features
      4.2.3 Result overview

5 Discussion
   5.1 Ease of use

6 Use case scenarios for open switches
   6.1 Current use cases
   6.2 Example use cases

7 Conclusions
   7.1 Future research

8 Acknowledgements

9 Bibliography


CHAPTER 1

Introduction

Network switches are essential in computer networks: they connect devices and forward frames between them on the OSI data link layer (Layer 2) [1]. Some switches are also capable of OSI Layer 3 features, such as routing IP packets between networks. In this thesis, we discuss exactly this category of devices. Traditionally, switches are sold as locked-in hardware with a pre-installed network operating system (NOS), without the possibility for a network administrator to install third-party NOSs or other software on them.

An open switch, in contrast, does allow the user to install another operating system on the device. Open switches thus give network administrators more freedom to customize the switch to their own needs, which may explain their rising popularity. Another advantage of this category of switches is reduced expense, due to the possibility of installing low-cost software. In the past, this cost reduction was offset by an increase in operational costs, because using the switches required hiring external Linux expertise to configure them [2]. However, as Linux expertise has grown over the past few years, this barrier to using open switches has become much lower. Also, large manufacturers such as Dell and HP have been developing open switches that ship with NOSs from Cumulus Networks or Pica8 pre-installed [3]. This removes another obstacle for companies adopting these switches, because no manual installation is required.

Their increasing popularity raises questions about open switches' suitability for use in production networks, their ease of use, and how their feature sets compare to those of traditional switches. Since open switches are relatively new compared to locked-in switches, testing and evaluation are still needed to assess whether open switches can replace traditional switches without a loss of functionality, performance or ease of use that the reduction in costs might not outweigh.

In this context, we examine the functioning of open switches running an open-source network operating system, namely SONiC, in a network. We study whether open switches with SONiC are able to deploy several fundamental networking features. In addition, we examine which use cases benefit from the open and flexible nature of open switches with SONiC. We therefore set out to answer the following two research questions:

1. Which networking features can be successfully deployed on open switches with SONiC?

2. Which use cases are (more) easily supported by open switches with SONiC?

Chapter 2 provides background information about open networking, open switches, network operating systems (SONiC specifically) and routing suites. Chapter 3 briefly discusses several networking features that are later used in our experiments. Chapter 4 contains the experiments we performed to answer our first research question. In chapter 5, we discuss the experimental methods and results and other findings obtained during this research. In chapter 6, we examine several use cases of open switches with SONiC. Lastly, in chapter 7 we return to our research questions, formulate a conclusion and suggest possibilities for future research.


CHAPTER 2

Open networking background

2.1 Open switches

Generally, a switch can be represented as a stack of four layered components. Figure 2.1 shows these components.

Figure 2.1: Layered component stack of a switch. From top to bottom: control and management plane, network operating system, hardware, silicon/ASIC.

The silicon, or ASIC (application-specific integrated circuit), is a hardware element designed for one specific task; in the case of switches, that task is forwarding packets through the network quickly [4]. The hardware layer includes all other physical components of the switch, such as the interfaces, the input/output ports, the LEDs and the power supply [5]. The network operating system (NOS) controls the hardware and the underlying ASIC for networking purposes and allows control and management plane applications to use the hardware. Control plane and management plane applications provide particular features to the user of the switch, in addition to those of the underlying operating system [6].

To understand the difference between open switches and traditional switches, one needs to consider how the above components interact. Switches in which the NOS and the underlying hardware are decoupled, meaning that they can be changed independently of each other, are called open switches. In traditional (locked-in) switches this is not possible, for the switch is delivered with pre-installed software that cannot be changed. Open switches thus give the user more choice in which NOS to run.

Open switches can be separated into subcategories. Bare metal switches provide the required hardware and allow the user to load an NOS of choice. The manufacturers of bare metal switches are original design manufacturers (ODMs) for well-known switch merchants, which means that the ODMs' products are re-branded and sold by other companies. A boot loader allows the user to boot an NOS of choice on the device [7].


2.2 Network operating systems

A network operating system is a key component in the aforementioned component stack of open switches. The NOS controls the hardware of the device and provides applications on the switch with the hardware and software resources they need, such as memory or input/output resources.

2.2.1 ASIC communication

The aforementioned application-specific integrated circuits (ASICs) are designed to handle a specific task. In networking switches, the ASIC is designed and optimized to quickly process incoming packets according to the routing table. To allow an NOS to program the ASIC, several APIs have been developed to communicate with it. The Switch Abstraction Interface (SAI) is a well-known method. It is an open-source framework that aims to abstract away from the ASIC, which differs per vendor, so that software can be used on multiple different switches without any changes. This allows for more freedom in the choice of software, independently of the hardware choice [8].

Another, less widely adopted method for ASIC communication is the use of SDKs developed by the ASIC vendor. In practice, this approach is not used in open-source software for open switches, because changes in the SDKs would require the application to be modified. Examples of such SDKs are the SwitchX SDK by Mellanox [9] and OpenNSL by Broadcom [10].

2.2.2 ASIC control module

On top of the ASIC API, the ASIC control module provides an interface for control plane applications to communicate with the hardware. It also presents the current state of the hardware to the user of the switch. Control plane applications can use the ASIC control module to read or write data stored in the hardware. These applications are therefore independent of the hardware in the machine they are running on. Figure 2.2 shows the role of the ASIC control module and the ASIC API in the component stack illustrated before.

Figure 2.2: ASIC control module and ASIC API in the layered component stack.


2.3 SONiC

SONiC (Software for Open Networking in the Cloud) is an open-source network operating system that claims to include all features needed for a fully functional Layer 3 network device. It is under constant development, led by Microsoft. The latest release, SONiC.201803, supports features including BGP, LLDP, link aggregation/LACP and VLAN (trunking) [11]. SONiC is Linux-based and runs on Debian Jessie. The SONiC architecture is depicted in figure 2.3.

Figure 2.3: An overview of SONiC components [12].

SONiC uses SAI to program the ASIC, which makes SONiC compatible with all ASICs that are supported by SAI. The SONiC Object Library allows external applications to interact with each other and with SONiC applications.

Switch State Service

The Switch State Service (SwSS) acts as the ASIC control module for SONiC, using databases to interface between the network applications running on the switch and the switch hardware. Figure 2.4 shows the SwSS architecture. Network applications read from and write to a database named APP DB. Orchestration agents are responsible for synchronization between APP DB and a second database, ASIC DB. To remain generic, the databases are set up as key-value stores. SwSS thus allows network applications running on SONiC to be completely independent of the hardware they are running on.

SONiC documentation: https://github.com/Azure/SONiC/wiki
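As a rough illustration of this pattern, the following Python sketch models APP DB and ASIC DB as plain dictionaries with an orchestration step between them. The table and field names are illustrative only, not SONiC's real schema, and SONiC actually implements these databases on Redis:

```python
# Toy model of the SwSS pattern: network applications write intent into
# APP DB; an orchestration agent translates it into ASIC DB entries,
# which a syncd-like process would then program into hardware via SAI.
# All key and field names below are illustrative, not SONiC's schema.

app_db = {}    # written by network applications (e.g. routes from fpmsyncd)
asic_db = {}   # consumed by the ASIC-programming process

def orchestrate(app_db, asic_db):
    """Synchronize APP DB intent into ASIC DB entries (one-shot sketch)."""
    for key, value in app_db.items():
        table, _, obj = key.partition(":")
        if table == "ROUTE_TABLE":
            # Translate an application-level route into an ASIC-level entry.
            asic_db[f"ASIC_STATE:ROUTE_ENTRY:{obj}"] = {
                "next_hop": value["nexthop"],
                "port": value["ifname"],
            }

# A network application announces a route; the agent propagates it.
app_db["ROUTE_TABLE:172.16.0.0/24"] = {"nexthop": "10.0.0.2",
                                       "ifname": "Ethernet32"}
orchestrate(app_db, asic_db)
print(asic_db)
```

The key point the sketch captures is that the application only ever touches APP DB; everything hardware-specific happens on the far side of the orchestration agent.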


Figure 2.4: The design of Switch State Service, showing the relation between APP DB, ASIC DB and the orchestration agents [12].

Since SONiC is open source, the networking community can contribute to its improvement. In addition, SONiC can be extended with other software components, whether open-source, third-party or proprietary. Figure 2.5 shows how SONiC fits into the Open Compute Project (OCP), an organization whose main goal is to design and deploy the most efficient and scalable hardware for use in data centers. OCP-certified switches can run SONiC through SAI as a Layer 2 or Layer 3 switch, and in addition allow the user to extend the NOS with third-party or OCP software components [13].

Figure 2.5: SONiC as an open switching platform for OCP users [13].

2.3.1 Quagga

Routing suites are control plane software collections that offer a variety of routing protocols to run on top of a network operating system. A routing suite exchanges routing information with other routers and updates routing information in the kernel. Quagga is an example of a routing suite and provides BGP routing functionality for SONiC. It is open source and currently supports GNU/Linux and BSD.

There are two main Quagga processes present on a SONiC device: bgpd and zebra. Figure 2.6 shows how Quagga and SONiC interact when a BGP route advertisement is received. First, the advertisement is passed to bgpd, short for BGP daemon. The BGP daemon processes the advertisement and hands the resulting routes to zebra. Programming the forwarding plane is then done with fpmsyncd, which implements a Forwarding Plane Manager interface and programs the forwarding plane. Fpmsyncd uses the SONiC Object Library API. SONiC handles updating the kernel routing table. Zebra is also capable of selecting the best route across different routing protocols, but as SONiC supports only BGP, this is not applicable to a SONiC device.

Figure 2.6: The interaction between Quagga and SONiC when a new BGP route is received [12]. Quagga determines whether a new route should be placed in the routing table, after which SONiC takes care of updating the kernel routing table.


CHAPTER 3

Networking features

This chapter briefly covers several networking features or protocols that are relevant to this research. We selected both OSI Layer 2 and 3 features that we consider fundamental to a data center switch.

3.1 Layer 2 features

Layer 2 of the OSI model is the data link layer. It provides mechanisms to allow communication between devices within the same local area network (LAN). Data elements transferred on the data link layer are called (link-layer) frames. A frame is transferred between two devices on a physical link, and MAC addresses are used to address link-layer frames. The Ethernet protocol is a well-known example of a Layer 2 protocol and is used often in local area networks and wide area networks. WiFi is an alternative technology, although it is used only in local area networks [14].

LLDP

The Link Layer Discovery Protocol (LLDP) allows networking devices to advertise information about their identity and capabilities. LLDP is useful for managing network resources and provides a vendor-independent mechanism for devices to exchange device information within a network. Each interface of a device sends out a frame consisting of an LLDP Data Unit (LLDPDU). An LLDPDU consists of a sequence of type-length-value (TLV) structures, each containing specific information. Each LLDPDU contains the following mandatory TLVs: Chassis ID, Port ID and Time To Live. Furthermore, optional TLVs can be included as well, such as system name, system capabilities and port description. LLDPDU frames are sent at a fixed interval, to keep the information up-to-date [15].
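The TLV structure described above is compact: a 16-bit header holds a 7-bit type and a 9-bit length, followed by the value itself. The following Python sketch encodes and decodes such TLVs; the chassis MAC and port name used are illustrative values, not output from our testbed:

```python
import struct

def encode_tlv(tlv_type: int, value: bytes) -> bytes:
    """Encode one LLDP TLV: 7-bit type, 9-bit length, then the value."""
    header = (tlv_type << 9) | len(value)
    return struct.pack("!H", header) + value

def decode_tlv(data: bytes):
    """Decode the first TLV in data; return (type, value, remaining bytes)."""
    (header,) = struct.unpack("!H", data[:2])
    tlv_type, length = header >> 9, header & 0x1FF
    return tlv_type, data[2:2 + length], data[2 + length:]

# Mandatory TLVs: Chassis ID (type 1), Port ID (type 2), Time To Live
# (type 3), closed by an End-of-LLDPDU TLV (type 0). The leading value
# byte of types 1 and 2 is the subtype (4 = MAC address, 5 = ifname).
lldpdu = (
    encode_tlv(1, b"\x04" + bytes.fromhex("ec0d9a8df1c0"))  # chassis = MAC
    + encode_tlv(2, b"\x05Ethernet32")                      # port = ifname
    + encode_tlv(3, struct.pack("!H", 120))                 # TTL = 120 s
    + encode_tlv(0, b"")                                    # end of LLDPDU
)

t, v, rest = decode_tlv(lldpdu)
print(t, v.hex())
```

Decoding the LLDPDU TLV by TLV in this way is essentially what an LLDP daemon does to fill the neighbor table shown later by `show lldp neighbors`.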

LACP

The Link Aggregation Control Protocol (LACP) can be used to combine multiple physical ports into a single logical channel. This process is also called link aggregation. To higher-layer protocols, the aggregated links appear as a single channel. Link aggregation provides increased throughput due to the use of multiple physical links. It also adds redundancy: even after a physical link failure, traffic can still flow through the logical link by making use of the other (operative) physical links [16].
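How an aggregate spreads traffic over its members can be sketched briefly: a switch typically hashes per-flow header fields onto one member link, so the frames of a flow stay in order while different flows can use different links. A minimal Python sketch follows; the hash function and inputs are illustrative, and real ASICs hash more fields (IP addresses, ports):

```python
import zlib

def pick_member(links, src_mac: str, dst_mac: str) -> str:
    """Hash a flow's MAC pair onto one member link of the aggregate.
    Illustrative only: real ASICs hash more header fields."""
    key = (src_mac + dst_mac).encode()
    return links[zlib.crc32(key) % len(links)]

links = ["Ethernet32", "Ethernet36"]
flow_a = pick_member(links, "ec:0d:9a:8d:f1:c0", "00:1c:73:7b:f7:5c")

# The same flow always maps to the same link, keeping its frames in
# order; if one link fails, hashing over the surviving links lets the
# remaining capacity carry the traffic.
assert flow_a == pick_member(links, "ec:0d:9a:8d:f1:c0", "00:1c:73:7b:f7:5c")
print(flow_a)
```

This determinism is why link aggregation increases aggregate throughput across many flows rather than speeding up any single flow.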

STP

The Spanning Tree Protocol (STP) constructs and manages a loop-free topology in a bridged network. STP prevents loops in a topology and guarantees that there is a unique path to every destination within the topology. This is critical in local area networks, because loops can cause Ethernet frames to be switched around endlessly. For instance, when a switch receives an ARP packet, it broadcasts this packet through all of its interfaces (except the one the ARP packet was received on). If a loop exists in the switch's network, the switch will eventually receive its own broadcast ARP packet, which will be broadcast again. This endless process is known as a broadcast storm and it is one of the phenomena STP should prevent [17]. It does so by selecting a Root Bridge and then creating a loop-free topology by blocking particular interfaces, such that all other bridges have a shortest path to the Root Bridge [18].

Figure 3.1: Example of how STP creates a loop-free topology with shortest paths to the Root Bridge [19].

Figure 3.1 shows an example of a topology with a loop. Switch A is selected as Root Bridge, thus all other switches should have a shortest path to switch A. By blocking the link between switch B and switch C, the loop in the topology is removed and all switches have a shortest path to switch A.
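The election step that starts this process can be sketched in a few lines: each bridge has an ID consisting of a configurable priority and its MAC address, and the numerically lowest ID wins. This is only a sketch of the comparison rule; real STP converges on the result by exchanging BPDUs. The priorities below match the values used in our STP experiment; the MAC addresses are those of the two testbed switches:

```python
# Sketch of STP root-bridge election: the bridge with the numerically
# lowest bridge ID (priority first, MAC address as tie-breaker) wins.

def elect_root(bridges):
    """bridges: list of (priority, mac) tuples; return the root bridge."""
    # Python tuples compare element-wise, matching STP's tie-break order.
    return min(bridges)

bridges = [
    (200, "00:1c:73:7b:f7:5c"),  # Arista,   priority 200
    (100, "ec:0d:9a:8d:f1:c0"),  # Mellanox, priority 100
]
print(elect_root(bridges))
```

With these priorities the Mellanox bridge should win the election, which is exactly the outcome the STP experiment in chapter 4 tries (and fails) to observe on SONiC.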

VLAN (trunking)

A Virtual Local Area Network (VLAN) can be described as an isolated and partitioned broadcast domain at OSI Layer 2. A Local Area Network (LAN) can be partitioned into several Virtual LANs to create multiple (logically) separate networks despite the (possible) presence of a physical connection to other devices that are not in the VLAN. This is achieved by configuring VLANs on a switch and specifying which switch interface belongs to which VLAN. The ability to define isolated broadcast domains in software makes VLANs a widely used networking feature. It allows network administrators to create multiple separated broadcast domains without having to change the physical wiring of the network [20].

A trunk link can be used to transport frames from multiple VLANs, for instance to interconnect two switches on which the same VLANs are configured. This is achieved by using tags in link-layer frames to specify which VLAN a frame belongs to. These trunk links are therefore called "tagged" links. Links that can only transport frames from one specific VLAN are called "untagged" links, and the frames transported on these links do not contain a VLAN tag. Figure 3.2 shows an example. The entirely blue and entirely red links are untagged links for VLAN 100 and VLAN 200 respectively, and the frames transported on these links do not contain a VLAN tag. The link between switch 1 and switch 2 is a tagged link: it must transport frames from both VLAN 100 and VLAN 200, so a tag must be used to indicate which VLAN a particular frame belongs to.


Figure 3.2: Example of a VLAN configuration with two switches and two VLANs [21]. The trunk link between switch 1 and 2 transports traffic from both VLAN 100 and VLAN 200.
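Tagging itself is a small frame transformation: an 802.1Q tag, consisting of the TPID 0x8100 plus a 16-bit TCI carrying the VLAN ID, is inserted right after the two MAC addresses. A minimal Python sketch (the frame contents are dummy bytes, not captured traffic):

```python
import struct

def add_vlan_tag(frame: bytes, vlan_id: int, pcp: int = 0) -> bytes:
    """Insert an 802.1Q tag (TPID 0x8100 + TCI) after the two MAC addresses."""
    tci = (pcp << 13) | (vlan_id & 0x0FFF)
    return frame[:12] + struct.pack("!HH", 0x8100, tci) + frame[12:]

def vlan_of(frame: bytes):
    """Return the VLAN ID of a tagged frame, or None if untagged."""
    if frame[12:14] == b"\x81\x00":
        return struct.unpack("!H", frame[14:16])[0] & 0x0FFF
    return None

# Dummy Ethernet frame: dst MAC, src MAC, EtherType 0x0800 (IPv4), payload.
untagged = bytes(6) + bytes(6) + b"\x08\x00" + b"payload"
tagged = add_vlan_tag(untagged, 100)
print(vlan_of(untagged), vlan_of(tagged))
```

A switch applies `add_vlan_tag` when forwarding a frame out a tagged (trunk) port and strips the tag again on untagged (access) ports.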

3.2 Layer 3 features

Layer 3 of the OSI model, the network layer, is responsible for routing data between different local area networks (LANs). Network-layer packets are addressed using unique, hierarchical IP addresses [14]. Routers forward packets by determining the route a packet must take to reach its destination. In a network, routers communicate with each other to exchange information about the availability of subnets. This is done by routing protocols, which in addition define policies or algorithms that determine the best route to a destination when multiple routes are available.

Inter-VLAN routing

Inter-VLAN routing is used when traffic must flow between different VLANs. Typically, this is accomplished by specifying a VLAN interface for each relevant VLAN, allowing the switch to route traffic between VLANs through these interfaces. Figure 3.3 shows an example of a router (R1) that routes between VLANs. When PC A sends traffic destined for PC B, the traffic is first tagged with VLAN 20. When this traffic reaches router R1, R1 routes it to VLAN 30 using the configured VLAN interfaces. The traffic then proceeds to PC B with the VLAN 30 tag.

Figure 3.3: Example of a topology in which inter-VLAN routing is needed [22]. Router R1 uses the VLAN interface addresses to route the traffic from VLAN 20 to VLAN 30.


BGP

An autonomous system (AS) is a collection of IP prefixes managed by a single network administrator. Commonly, autonomous systems are managed by internet providers or large companies. The Border Gateway Protocol (BGP) is a well-known routing protocol used for routing between different autonomous systems, but it can also be used to route within an AS.

Routers running BGP can set up sessions with each other over TCP to exchange routing information, for instance subnet availability. BGP provides several mechanisms to control the propagation of routes, such as route maps. Route maps allow the network administrator to define rules for certain prefixes [23]. For instance, there may be rules to drop an incoming route or to modify it before placing it in the routing table. BGP is commonly used in large-scale data centers because it scales well compared to other routing protocols [24].
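As an illustration, a Quagga-style route-map of the kind described above could look like the following configuration fragment. The AS numbers, neighbor address, prefix and names are hypothetical, not taken from our testbed:

```
! Accept only 10.1.0.0/16 from neighbor 192.0.2.1 and raise its preference;
! the implicit deny at the end of the route-map drops everything else.
ip prefix-list PL-ACCEPT seq 5 permit 10.1.0.0/16
!
route-map RM-IN permit 10
 match ip address prefix-list PL-ACCEPT
 set local-preference 200
!
router bgp 65010
 neighbor 192.0.2.1 remote-as 65020
 neighbor 192.0.2.1 route-map RM-IN in
```

Applied inbound, this route-map filters and modifies advertisements before bgpd considers them for the routing table, which is exactly the propagation control described above.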


CHAPTER 4

Experiments

4.1 Preparatory phase

To examine whether open switches with SONiC can be deployed successfully in a network, we performed experiments on the deployment of the fundamental networking features described in chapter 3. The SNE OpenLab provided two open switches for this research:

• Mellanox SN2100

• Arista 7050QX-32S

The ASICs in these switches are manufactured by different vendors: Mellanox and Broadcom, respectively. This allows us to test two different ASIC types at once, and shows whether SONiC can run on two different ASICs. Both switches are capable of operating on OSI Layer 2 and Layer 3. Figures 4.1 and 4.2 show photos of Mellanox SN2100 and Arista 7050QX-32S, respectively. More details of the switches can be found in Appendix A.

Figure 4.1: Mellanox SN2100.

Figure 4.2: Arista 7050QX-32S.

Mellanox SN2100 product brief: http://www.mellanox.com/related-docs/prod_eth_switches/PB_SN2100.pdf


The open-source NOS used in the experiments is SONiC. Section 2.3 briefly explained the architecture of SONiC and the features it supports. SONiC is under constant development, and new features are added regularly. Aside from SONiC, there are other open-source network operating systems, such as OPX. SONiC, however, supports a wide variety of open switches, which made it an obvious NOS to experiment with and allows for a straightforward extension of this research to other supported devices. Details about the version of SONiC used in this research are provided in Appendix B1.

4.1.1 Mellanox SN2100 and ONIE

The Mellanox SN2100 comes with Mellanox Onyx as its NOS. Other operating systems can be installed via ONIE (Open Network Install Environment), an open-source project initiated by Cumulus Networks that enables an open networking environment and is present by default on various open switches [25]. Figure 4.3 shows how ONIE can be used to boot a network operating system. When a device boots for the first time, a low-level boot loader boots ONIE from flash memory. ONIE then fetches the NOS image supplied by the switch vendor and installs it. On subsequent boots, ONIE is not used by default and the device boots straight into the NOS. ONIE can, however, still be used to uninstall the vendor NOS and install another (open-source) NOS from an image provided over the network or via USB, for instance [25].

Figure 4.3: Process of a first-time boot using ONIE [25].

4.1.2 Arista 7050QX-32S and Aboot

The Arista 7050QX-32S runs Arista EOS by default. A component of this operating system is Aboot, which can be used to boot image files to run other operating systems. Boot parameters can be configured in files stored in flash memory. In addition, Aboot provides a shell that can be used to modify the boot configuration [26].


4.2 Feature tests

In the previous chapter, we discussed several networking features we consider crucial for a network switch. The objective was to test whether each of these features worked correctly on the Mellanox and Arista switches running SONiC. The results of the experiments provide insight into whether these switches can be deployed in a data center network as a replacement for traditional switches. For most experiments, we used SONiC's config_db.json configuration file to set up our testing environment. Appendix C contains the most relevant configurations. Complete configuration files can be found in the GitHub repository.
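For reference, configuration in config_db.json takes the form of JSON tables keyed by object name. The following small fragment is illustrative: the table names follow SONiC's schema, but the values shown are examples rather than our exact testbed configuration (which is in Appendix C):

```json
{
    "VLAN": {
        "Vlan100": { "vlanid": "100" }
    },
    "VLAN_MEMBER": {
        "Vlan100|Ethernet32": { "tagging_mode": "untagged" }
    },
    "PORTCHANNEL": {
        "PortChannel1": { "admin_status": "up", "mtu": "9100" }
    },
    "PORTCHANNEL_MEMBER": {
        "PortChannel1|Ethernet32": {},
        "PortChannel1|Ethernet36": {}
    }
}
```

Each top-level key is a table, and each entry maps a key (sometimes composite, separated by `|`) to a set of field-value pairs, mirroring the key-value layout of the underlying databases.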

4.2.1 Layer 2 features

LLDP

Methods

The first feature to be tested was LLDP. To verify whether LLDP correctly exchanged device information, we considered the topology depicted in figure 4.4. We wanted to verify whether all mandatory LLDP TLVs were exchanged between Mellanox and Arista. This mandatory information is as follows [15]:

• Chassis ID
• Port ID
• Time To Live

Furthermore, we examined whether any additional information, such as system information and device capabilities, is exchanged through LLDP as well.

Figure 4.4: Topology to analyze LLDP with Mellanox and Arista. Mellanox ports Ethernet32 and Ethernet36 are connected to Arista ports Ethernet16 and Ethernet20, respectively.

Results

SONiC provides the command show lldp neighbors to view the LLDP neighbor information registered by SONiC. Appendix D1 presents the complete output of this command. Both devices presented the mandatory LLDP information in their LLDP tables. SONiC takes the advertised Chassis ID from the MAC address of the eth0 interface, on both switches.

In addition, LLDP exchanged correct system name and system description information, and also included device capabilities and which of these capabilities were currently in use.

LACP

Methods

In addition, we tested LACP/link aggregation behaviour. Using the topology depicted in figure 4.5, we tried to configure link aggregation combining the two links between Mellanox and Arista. We considered both Layer 2 and Layer 3 link aggregation. The latter has an IP address configured for each link aggregation interface, while the former does not.

A working port channel should provide redundancy in the sense that when one of the links fails, communication is still possible through the port channel via the other link. Thus, to test an aggregated link, one can simulate a physical breakdown of one of the two links and examine whether communication between the two devices is still possible.


Figure 4.5: Topology to analyze LACP, with port channel PortChannel1 consisting of the two physical links between Mellanox and Arista.

Results

Appendix C1.1 shows the configuration we set for a Layer 2 port channel between Mellanox and Arista. We were unable to configure a successful Layer 2 port channel. That is, SONiC could not successfully set up a port channel with no interface addresses. PortChannel1 failed to go UP on both devices. Listing 4.1 shows that PortChannel1 was stuck in DOWN state.

admin@sonic-mellanox:~$ ip a show dev PortChannel1
4: PortChannel1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 9100 qdisc noqueue state DOWN group default
    link/ether ec:0d:9a:8d:f1:c0 brd ff:ff:ff:ff:ff:ff

Listing 4.1: PortChannel1 failed to go UP.

Moreover, with teamdctl we were able to view the state of PortChannel1 within the SONiC teamd docker container, which is responsible for link aggregation. On both Mellanox and Arista, no ports were present as slaves of PortChannel1. Listing 4.2 shows this for Mellanox. A SONiC collaborator confirmed that port channel functionality is currently aimed at Layer 3 port channels, not Layer 2 port channels.

admin@sonic-mellanox:~$ docker exec -it 7e504bdc03e0 bash
root@sonic-mellanox:/# teamdctl PortChannel1 state view
setup:
  runner: lacp
runner:
  active: yes
  fast rate: no

Listing 4.2: No slave ports present for PortChannel1.

Appendix C1.2 shows the configuration of the Layer 3 port channel. Layer 3 link aggregation did work correctly. In this case, PortChannel1 did go UP and allowed us to communicate between the interface IP addresses configured on PortChannel1 (172.16.0.10/24 on the Mellanox side and 172.16.0.20/24 on the Arista side). Also, teamdctl now showed us the correct slave ports for PortChannel1.

In addition, Layer 3 link aggregation passed our redundancy tests. Listing 4.3 shows an example of this (some output has been omitted for brevity). Appendix D2 shows more complete output of the example. In listing 4.3, it can be seen on lines 4-14 that when both links are operational, communication through the port channel is possible. In lines 20-21, one can see that one of the links is not operational anymore. Line 27 shows that despite the fact that one link broke down, PortChannel1 is still operational. Next, lines 32-36 show that communication is possible through the port channel even when one of the two links is not operational. This indicates that Layer 3 link aggregation provides a redundant connection between Mellanox and Arista.

(23)

 1  # (both links up)
 2
 3  # (...)
 4  Ethernet32  32,33,34,35  N/A  9100  Ethernet32  up    up
 5  Ethernet36  36,37,38,39  N/A  9100  Ethernet36  up    up
 6  # (...)
 7
 8  # (communication is fine)
 9
10  admin@sonic-mellanox:~$ sudo ping 172.16.0.20
11  PING 172.16.0.20 (172.16.0.20) 56(84) bytes of data.
12  64 bytes from 172.16.0.20: icmp_seq=1 ttl=64 time=0.362 ms
13  64 bytes from 172.16.0.20: icmp_seq=2 ttl=64 time=0.220 ms
14  64 bytes from 172.16.0.20: icmp_seq=3 ttl=64 time=0.320 ms
15  # (...)
16
17  # (link failure Ethernet32!)
18
19  # (...)
20  Ethernet32  32,33,34,35  N/A  9100  Ethernet32  down  up
21  Ethernet36  36,37,38,39  N/A  9100  Ethernet36  up    up
22  # (...)
23
24  # (PortChannel1 still UP)
25
26  # (...)
27  4: PortChannel1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9100 qdisc noqueue state UP group default
28  # (...)
29
30  # (communication still working)
31
32  admin@sonic-mellanox:~$ sudo ping 172.16.0.20
33  PING 172.16.0.20 (172.16.0.20) 56(84) bytes of data.
34  64 bytes from 172.16.0.20: icmp_seq=1 ttl=64 time=0.294 ms
35  64 bytes from 172.16.0.20: icmp_seq=2 ttl=64 time=0.310 ms
36  64 bytes from 172.16.0.20: icmp_seq=3 ttl=64 time=0.296 ms
37  # (...)

Listing 4.3: Behaviour of PortChannel1 when the link Ethernet32 (Mellanox)-Ethernet16 (Arista) fails (some output has been omitted for brevity).

STP

Methods

We also examined the Spanning Tree Protocol. Considering the topology in figure 4.6, if no STP is configured at all, a broadcast storm may occur because there is a network loop. Broadcast messages sent from Mellanox to Arista out of both Ethernet32 and Ethernet36 will be circulated, because Arista will broadcast them back to Mellanox through Ethernet20 and Ethernet16, respectively, leading to an endless broadcast storm. STP must prevent this by selecting a Root Bridge and blocking one of the links between Mellanox and Arista.

Figure 4.6: Topology to analyze STP. Both links between Mellanox and Arista carry VLAN 100.

The current version of SONiC does not support STP, but we decided to try to configure STP nevertheless using brctl. We placed the relevant ports in VLAN 100 and first set the priority of the Mellanox bridge to 100 and that of the Arista bridge to 200, meaning that Mellanox should be selected as Root Bridge and one of the Arista ports should be set to the blocking state (a lower priority indicates a higher chance of being selected as Root Bridge). We then reversed the priorities, so that Arista should be selected as Root Bridge and one of the Mellanox ports should be in the blocking state. The relevant SONiC configurations (of the VLANs) can be found in Appendix C2.

Results

In both tests the devices selected themselves as Root Bridge, meaning that all ports were set in forwarding state and the loop in our topology remained. Listings 4.4 and 4.5 show the STP status on Mellanox and Arista (some output has been omitted for brevity). The full output can be found in Appendix D3. Lines 8 and 12 of both listings show that all four ports are in forwarding state, and lines 3 and 4 show that in both cases the devices selected themselves as Root Bridge.

1  admin@sonic-mellanox:~$ sudo brctl showstp Bridge
2  Bridge
3   bridge id          0064.ec0d9a8df1c0
4   designated root    0064.ec0d9a8df1c0
5  # (...)
6
7  Ethernet32 (1)
8   port id   8001   state   forwarding
9  # (...)
10
11 Ethernet36 (2)
12  port id   8002   state   forwarding
13 # (...)

Listing 4.4: Mellanox selected itself as Root Bridge (some output was omitted for brevity).

1  admin@sonic-arista:~$ sudo brctl showstp Bridge
2  Bridge
3   bridge id          00c8.001c737bf75c
4   designated root    00c8.001c737bf75c
5  # (...)
6
7  Ethernet16 (1)
8   port id   8001   state   forwarding
9  # (...)
10
11 Ethernet20 (2)
12  port id   8002   state   forwarding
13 # (...)

Listing 4.5: Arista selected itself as Root Bridge (some output was omitted for brevity).

To investigate this behaviour, we used tcpdump to capture STP messages between Mellanox and Arista. We found that both devices were sending STP messages to each other, but received STP messages were not passed to the control plane. A SONiC contributor confirmed that in the current version of SONiC, no interface trap is configured for STP messages, meaning that incoming STP messages are not passed to the control plane and thus STP is unable to operate correctly on SONiC7.

VLAN (trunking)

Methods

Additionally, we examined VLAN (trunking). We decided that for this experiment, and the other experiments following this one, it would be interesting to use several hosts in our testing topologies. In order to do so, we attached two physical servers to our topology and set up two virtual machines (VMs) to run on each host server. We used the same operating system on all four VMs (Linux 4.9.82-1+deb9u3).

We used Vagrant8 to build and manage our virtual machine environment, and VirtualBox9 for the virtual machines themselves. Appendices B2 and B3 specify the Vagrant and VirtualBox versions used in the experiments, respectively. To allow the VMs to participate in the network, we configured a bridge between the physical interfaces of the servers and the virtual interfaces of the VMs using the Vagrantfiles. The virtual interfaces could then be configured as if they were physical interfaces on four separate machines. The Vagrantfiles used in this experiment can also be found in the previously mentioned GitHub repository.

To verify whether the switches are able to perform correct VLAN (trunking) functionality, we used the topology depicted in figure 4.7. For the SONiC configuration, Appendix C3 can be consulted. The link between Mellanox and Arista should be configured as a "tagged" (trunk) link, for it must carry frames that can belong to either VLAN 100 or VLAN 200. We examined whether the open switches allowed us to configure such "tagged" ports and whether packets can be delivered correctly within the same VLANs using the trunk link. For instance, VM A1 must be able to communicate with VM B1, because they are configured to be in the same VLAN. VM A1 should not be able to communicate with VM B2, because they are in different VLANs and should therefore be completely isolated from each other.
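The isolation requirement can be expressed as a small model of 802.1Q trunking. The port-to-VLAN mapping below is our reading of figure 4.7 and is illustrative only:

```python
# Toy model of 802.1Q trunking: an access port tags ingress frames with
# its VLAN ID, the trunk carries the tag, and a frame is delivered out
# of an access port only if that port's VLAN matches the tag.
access_vlan = {
    "Ethernet56": 100, "Ethernet60": 200,  # Mellanox ports to server A
    "Ethernet40": 100, "Ethernet44": 200,  # Arista ports to server B
}

def delivered_over_trunk(ingress_port, egress_port):
    tag = access_vlan[ingress_port]          # tag added on ingress
    return access_vlan[egress_port] == tag   # delivered only if VLANs match

print(delivered_over_trunk("Ethernet56", "Ethernet40"))  # True  (VM A1 -> VM B1)
print(delivered_over_trunk("Ethernet56", "Ethernet44"))  # False (VM A1 -> VM B2)
```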


Figure 4.7: Topology to analyze VLAN (trunking) and inter-VLAN routing. Two VLANs are configured with two virtual machines in each VLAN. A trunk link is present between Mellanox and Arista.

Results

We were able to communicate within VLANs, but not between VLANs, which is the correct and expected behaviour. As an example, listing 4.6 shows with ping that communication is possible from VM A1 to VM B1, which are in the same VLAN.

vagrant@vmA1:~$ ping 172.16.100.2
PING 172.16.100.2 (172.16.100.2) 56(84) bytes of data.
64 bytes from 172.16.100.2: icmp_seq=1 ttl=64 time=0.602 ms
64 bytes from 172.16.100.2: icmp_seq=2 ttl=64 time=0.580 ms
64 bytes from 172.16.100.2: icmp_seq=3 ttl=64 time=0.591 ms

Listing 4.6: ping from VM A1 to VM B1.

In contrast, communication between devices configured in different VLANs is not possible. Listing 4.7 displays this behaviour. The example shows that we cannot communicate from VLAN 100 to VLAN 200.

vagrant@vmA1:~$ ping 172.16.200.2
PING 172.16.200.2 (172.16.200.2) 56(84) bytes of data.
From 172.16.100.1 icmp_seq=1 Destination Host Unreachable
From 172.16.100.1 icmp_seq=2 Destination Host Unreachable
From 172.16.100.1 icmp_seq=3 Destination Host Unreachable

Listing 4.7: ping from VM A1 to VM B2.

This indicates that the VLAN trunk succeeded in tagging frames with the correct VLAN number, allowing for the complete VLAN isolation shown in the listings above.

4.2.2 Layer 3 features

Inter-VLAN routing

Methods

Besides VLAN trunking functionality, we also tested inter-VLAN routing using the same topology as depicted in figure 4.7. Figure 4.7 shows the VLAN interface addresses we used for both Mellanox and Arista. For the relevant sections of the SONiC configuration, Appendix C4 can be consulted. If inter-VLAN routing works correctly, the switches will route packets from one VLAN to another and thus we should be able to communicate between VLANs using the VLAN interfaces. For instance, VM A1 should be able to ping VM B2 and vice versa.
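The routing step that makes this possible can be sketched with the Mellanox VLAN interface addresses from figure 4.7 (a simplified model, not SONiC code):

```python
import ipaddress

# The switch's VLAN interfaces act as gateways: a packet destined for
# another VLAN is sent to the gateway, which forwards it into whichever
# VLAN's subnet contains the destination address.
vlan_interfaces = {
    100: ipaddress.ip_interface("172.16.100.3/24"),
    200: ipaddress.ip_interface("172.16.200.3/24"),
}

def egress_vlan(dst_ip):
    """Return the VLAN whose subnet contains dst_ip, or None."""
    dst = ipaddress.ip_address(dst_ip)
    for vlan, iface in vlan_interfaces.items():
        if dst in iface.network:
            return vlan
    return None

print(egress_vlan("172.16.200.2"))  # 200: forwarded out of the VLAN 200 interface
```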

Results

Using inter-VLAN routing we are able to communicate between different VLANs. Listing 4.8 shows an example of a traceroute from VM A1 to VM B2. It shows on line 3 that the VLAN interface of VLAN 100 on Mellanox (172.16.100.3) is used to route the packet to VLAN 200, after which Arista delivers the packet to VM B2.

1 vagrant@vmA1:~$ traceroute 172.16.200.2
2 traceroute to 172.16.200.2 (172.16.200.2), 30 hops max, 60 byte packets
3  1  172.16.100.3 (172.16.100.3)  0.399 ms  0.307 ms  0.227 ms
4  2  172.16.200.2 (172.16.200.2)  0.696 ms  0.619 ms  0.526 ms

Listing 4.8: traceroute from VM A1 to VM B2.

BGP

Methods

To verify whether BGP operates correctly on the switches, we set up the topology in figure 4.8. SONiC's BGP configuration can be found in Appendix C5. We configured the switches to be in different autonomous systems, so as to create a BGP session between Mellanox and Arista. We expect this BGP session to exchange routes between Mellanox and Arista. Specifically, Mellanox has knowledge of subnets 10.0.0.0/24 and 10.0.1.0/24 and Arista does not. BGP must propagate the routes to these subnets to Arista. Similarly, BGP must propagate routes to the subnets 10.0.4.0/24 and 10.0.5.0/24 to Mellanox. In the end, we want to be able to communicate between the virtual machines in hosts A and B using these propagated routes.
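The expected exchange can be modeled as each peer advertising its connected prefixes to the other (a toy model of one advertisement round, not a BGP implementation):

```python
# After the session comes up, each router's table should contain its
# own connected prefixes plus everything advertised by its peer.
mellanox_connected = {"10.0.0.0/24", "10.0.1.0/24"}
arista_connected   = {"10.0.4.0/24", "10.0.5.0/24"}

def after_exchange(own, peer):
    """Routes known after receiving the peer's advertisements."""
    return own | peer

mellanox_routes = after_exchange(mellanox_connected, arista_connected)
arista_routes   = after_exchange(arista_connected, mellanox_connected)
print(sorted(mellanox_routes))
# ['10.0.0.0/24', '10.0.1.0/24', '10.0.4.0/24', '10.0.5.0/24']
```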



Figure 4.8: Topology to analyze BGP. Mellanox and Arista are placed in two separate au-tonomous systems, each containing two subnets.

Results

SONiC successfully set up the two BGP sessions we configured. For example, listing 4.9 shows on lines 8 and 9 the current BGP sessions Mellanox has with Arista.

1  admin@sonic-mellanox:~$ show ip bgp summary
2  Command: sudo vtysh -c "show ip bgp summary"
3  BGP router identifier 10.1.0.10, local AS number 65000
4  RIB entries 21, using 2352 bytes of memory
5  Peers 2, using 9312 bytes of memory
6
7  Neighbor   V  AS     MsgRcvd  MsgSent  TblVer  InQ  OutQ  Up/Down   State/PfxRcd
8  10.0.2.20  4  65100  34       37       0       0    0     00:30:30  8
9  10.0.3.20  4  65100  38       41       0       0    0     00:30:27  8
10
11 Total number of neighbors 2

Listing 4.9: BGP neighbors on Mellanox, showing the two sessions with Arista.

In addition, Mellanox and Arista successfully exchanged routing information regarding the subnets that were directly connected to them. Mellanox propagated the subnets 10.0.0.0/24 and 10.0.1.0/24 to Arista and Arista propagated the subnets 10.0.4.0/24 and 10.0.5.0/24 to Mellanox. For instance, listing 4.10 shows that subnets 10.0.4.0/24 (line 11) and 10.0.5.0/24 (line 13) are present in the routing table on Mellanox, that these subnets can be reached through both 10.0.2.20 (via interface Ethernet32) and 10.0.3.20 (via interface Ethernet36), and that this reachability was sourced from BGP. Similarly, in the Arista routing table, subnets 10.0.0.0/24 and 10.0.1.0/24 are present and can be reached through 10.0.2.10 (via interface Ethernet16) and 10.0.3.10 (via interface Ethernet20). Again, these routes were learned through BGP.


1  admin@sonic-mellanox:~$ show ip route
2  Command: sudo vtysh -c "show ip route"
3  Codes: K - kernel route, C - connected, S - static, R - RIP,
4         O - OSPF, I - IS-IS, B - BGP, P - PIM, A - Babel,
5         > - selected route, * - FIB route
6
7  C>* 10.0.0.0/24 is directly connected, Ethernet56
8  C>* 10.0.1.0/24 is directly connected, Ethernet60
9  C>* 10.0.2.0/24 is directly connected, Ethernet32
10 C>* 10.0.3.0/24 is directly connected, Ethernet36
11 B>* 10.0.4.0/24 [20/0] via 10.0.2.20, Ethernet32, src 10.1.0.10, 00:30:05
12   *             via 10.0.3.20, Ethernet36, src 10.1.0.10, 00:30:05
13 B>* 10.0.5.0/24 [20/0] via 10.0.2.20, Ethernet32, src 10.1.0.10, 00:30:05
14   *             via 10.0.3.20, Ethernet36, src 10.1.0.10, 00:30:05
15 C>* 10.1.0.10/32 is directly connected, lo
16 B>* 10.1.0.20/32 [20/0] via 10.0.2.20, Ethernet32, src 10.1.0.10, 00:30:05
17   *             via 10.0.3.20, Ethernet36, src 10.1.0.10, 00:30:05
18 # (...)

Listing 4.10: Routing table on Mellanox, showing the routes to the subnets we configured before (some output was omitted for brevity).

Ultimately, to complete the verification of BGP, we used traceroute between the virtual machines to determine whether communication is possible and, if so, which routes are taken. Listing 4.11 shows that VM A1 and VM B2 are able to communicate through Mellanox and Arista. First, a packet goes through the connected interface on the Mellanox device (10.0.0.10) (line 3). Then, it is routed to the Arista device (10.1.0.20) (line 4), which routes it via its interface in subnet 10.0.5.0/24. SONiC uses the loopback address configured in config_db.json as Router ID in the BGP sessions, which for Mellanox is 10.1.0.10/32 and for Arista 10.1.0.20/32.

1 vagrant@vmA1:~$ sudo traceroute 10.0.5.2
2 traceroute to 10.0.5.2 (10.0.5.2), 30 hops max, 60 byte packets
3  1  10.0.0.10 (10.0.0.10)  0.333 ms  0.229 ms  0.258 ms
4  2  10.1.0.20 (10.1.0.20)  0.381 ms  0.264 ms  0.250 ms
5  3  10.0.5.2 (10.0.5.2)  0.523 ms  0.416 ms  0.312 ms

Listing 4.11: traceroute from VM A1 to VM B2.

4.2.3 Result overview

Table 4.1 presents an overview of the obtained results, including several remarks.

Feature              Results  Comments
LLDP                 Pass     -
LACP                 Pass     L2 link aggregation not working
STP                  Fail     Not supported by SONiC; packets dropped
VLAN (trunk)         Pass     -
Inter-VLAN routing   Pass     -
BGP                  Pass     -

CHAPTER 5

Discussion

The performed experiments provided insights into the shortcomings of several fundamental networking features when deployed on open switches with SONiC. Aside from commenting on some of the obtained results, this chapter will provide general insights into the ease of use of open switches with SONiC.

LACP

The first notable observation in our LACP experiments was that none of our port channels were showing up when issuing the command show interfaces portchannel. This command should show all configured port channels in SONiC. It turns out that SONiC takes the configuration of minigraph.xml, a deprecated configuration method of SONiC, for the command show interfaces portchannel1, instead of the main configuration file config_db.json.

The Layer 3 port channel we set up in the experiments has shown that it is possible to deploy a Layer 3 port channel that provides redundant communication between the devices running SONiC. In order to get this working, aside from configuring the port channel in SONiC's configuration file, we had to edit SONiC's teamd template2. By default, SONiC sets the field min_ports, the minimum number of active ports required for the port channel to be operational, to ceil(0.75 * number of port channel members), which was 2 in our topology. In order to test the redundancy of our port channel, we set min_ports to 1.
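The default threshold and our override can be reproduced with a quick calculation (our reading of the template's formula):

```python
import math

# SONiC's teamd template derives the minimum number of active members
# a port channel needs before it is considered operational.
def default_min_ports(members):
    return math.ceil(0.75 * members)

print(default_min_ports(2))  # 2: with the default, one failed link downs the channel
# Overriding min_ports to 1 keeps the two-member channel up when a
# single link fails, which is what our redundancy test required.
```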

BGP

Similarly, successful BGP functionality was demonstrated by the experiments we performed. Configuring BGP neighbors and AS numbers was done in config_db.json. Nonetheless, we had to edit SONiC's BGP configuration template3 to specify that BGP should also advertise directly connected networks (which was the case in our topology) and not only networks obtained from other BGP neighbors.

5.1 Ease of use

SONiC allowed for straightforward configuration in the config_db.json configuration file. Several examples of configuration have been shown in this thesis already. Moreover, SONiC provided some CLI configuration commands, but not for all features we tested. For instance, configuration commands could be used to shut down interfaces and BGP sessions, but not to set interface addresses or configure BGP peers. Command-line configuration also allowed the user to set up VLANs and add or remove interfaces. In general, the command-line configuration provided a small subset of the configuration possibilities available in config_db.json. For users that prefer command-line configuration, this might make SONiC less straightforward to use.

1 https://github.com/Azure/sonic-utilities/blob/master/scripts/teamshow

2 https://github.com/Azure/sonic-buildimage/blob/master/dockers/docker-teamd/teamd.j2

SONiC did provide a wide variety of commands to view current configuration or device status. Several examples have been provided in this research. For all supported features we tested (thus excluding STP), there existed one or more corresponding show commands to view configuration or state information. One exception, as mentioned before, was that show interfaces portchannel takes its information from a deprecated configuration method to show the configured port channels.

In short, the ease of use of SONiC depends on the preferences of the user. Users that prefer working in a configuration file will probably experience SONiC as a user-friendly NOS, while users that prefer CLI configuration might be less comfortable with SONiC's configuration.


CHAPTER 6

Use case scenarios for open switches

The use of open switches with SONiC can be beneficial in various use cases. For example, SONiC allows the user to look into the source code and then use or extend it for a certain objective. In this chapter, we briefly examine a current use case that exploits the properties of open switches with SONiC, and we provide two example use cases ourselves.

6.1 Current use cases

Microsoft Global Cloud

Microsoft developed SONiC for use in their Microsoft Global Cloud. Since Microsoft runs one of the largest networks in the world, Microsoft sets out strict requirements for the technology that it deploys. One of these requirements is that new features can be implemented without having an impact on the end users. Since SONiC consists of several containers, each containing the resources to deploy a certain networking feature (e.g. BGP, LAG), only one container needs to be updated when there is an update or a bug fix for a certain feature, instead of replacing the whole switch image (which will result in data plane downtime). This makes SONiC suitable for use cases in which no downtime is permitted while deploying updates [13].

Another requirement is that SONiC can be used on the newest and most innovative hardware platforms. Because SONiC uses SAI, a data center can constantly innovate with newer switch hardware without having to change the software stack. Regardless of what switch hardware is used, as long as SAI is supported, all switches can be configured the same way, since their software stacks can be identical. Operators are thus able to preserve their software investments while keeping up with hardware innovation. Thus, SONiC is suitable for use cases in which there may be regular changes in hardware [13].

In addition, Microsoft states the requirement that cloud-scale deep telemetry and fully automated failure mitigation be utilized in their cloud. Since SONiC has innovations such as NetBouncer and Everflow available, it meets these requirements. NetBouncer can be deployed to automatically and accurately detect faulty devices or links within a large data center [27]. Everflow debugs several network faults, such as packet drops or loops; furthermore, it can quickly identify devices that cause high latency in a network [28]. In short, SONiC is suitable for use cases in which large networks have to be monitored constantly to automatically respond to potential problems.

6.2 Example use cases

In addition to the above use case of SONiC in the Microsoft Global Cloud, we provide two example use cases in which the open nature of SONiC can be deployed for useful applications.


Plug-and-play VLAN

The first use case relates to companies that want to provide flexible working spots to their employees, who in turn own a laptop provided by the company. On different days the same employee might want to work on different physical locations within the company office. To gain access to the company’s network, the employees can use Ethernet cables to connect their laptops with the network. Since generally a company consists of several departments with partitioned and isolated computer networks (VLANs) and an employee from any department can sit and connect anywhere, how does the network decide to which VLAN the newly connected laptop should be added? Employees want to start working right away after connecting to the network, so it is preferred their laptops are added to the correct VLAN without intervention by the network administrator.

If the company network consists of open switches running SONiC, the network administrator could decide to develop an application that automatically places newly connected hosts in a particular VLAN, depending on the MAC address of the host, so that the VLAN functionality on the switch is essentially plug-and-play. That is, no manual configuration is needed every time a new host connects.

This can be implemented as follows. Via SONiC's logs or LLDP entries, newly connected hosts (including their MAC addresses) can be discovered by the application. The network administrator can define a policy that states the relation between the host MAC address and the VLAN that host will be placed in, if at all. This is illustrated in the right side of figure 6.1. For instance, the application can decide to block hosts with unknown MAC addresses, or place hosts manufactured by the same vendor in the same VLAN (MAC addresses are vendor-specific). Next, the application can simply edit SONiC's config_db.json configuration file to change the VLAN settings.
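A minimal sketch of the policy lookup could look as follows; the MAC addresses, VLAN numbers and the decide helper are hypothetical, and a real application would follow a positive decision by rewriting config_db.json:

```python
# Hypothetical policy table mapping known company laptops to VLANs.
policy = {
    "aa:bb:cc:00:00:01": 100,  # laptop of a department-A employee
    "aa:bb:cc:00:00:02": 200,  # laptop of a department-B employee
}

def decide(mac):
    """Return the VLAN for a newly detected MAC, or 'block' if unknown."""
    return policy.get(mac.lower(), "block")

print(decide("AA:BB:CC:00:00:01"))  # 100
print(decide("de:ad:be:ef:00:00"))  # block
```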

We suppose that the network administrator has a list of all company laptops in use by the employees, including their MAC addresses. The network administrator can easily define a policy that maps each laptop to a certain department (VLAN). This way, no matter where an employee connects its laptop to the network, the SONiC switch will detect the MAC address of this laptop and determine in which VLAN the switch interface this new host is connected to should be placed. Similarly, if a laptop not belonging to one of the employees uses one of the Ethernet cables, its MAC address is not recognized and can be blocked by the switch.

Figure 6.1: An overview of the plug-and-play VLAN application. When a laptop with MAC address x is detected on switch interface y, the application consults the policy table to decide whether interface y should be added to a VLAN z or the host should be blocked.

Automatic port channel

Our second use case focuses on data centers that need to provide a redundant network, because they handle important data that must reach its destination. The network administrator can decide to aggregate all connections between the switches in the data center to provide redundancy. Such a data center may have a large number of switches, and configuring link aggregation for the connections between these switches can be time-consuming.

LACP provides a way to do this, because it is able to automatically negotiate between two devices to set up a port channel. However, before this can happen, the network administrator needs to specify which switch interfaces belong to which port channel, if any. The application proposed in the next paragraph offers a more flexible alternative to this LACP feature by allowing the network administrator to connect neighbors to any interface on a switch without having to specify that these interfaces will be used to create a port channel. The application will dynamically determine whether a port channel should be set up, and if so, with which interfaces.

If the data center contains open switches with SONiC, the network administrator can develop an application that automatically sets up a port channel when it discovers that the switch has multiple physical links with the same neighbor (for instance by comparing MAC addresses or Chassis IDs of all connected devices). This way, when the network administrator connects two switches with multiple physical links, no further configuration is necessary to set up the port channel.


Figure 6.2: An overview of the automatic port channel application. The switch detects a new neighbor with Chassis ID x on switch interface y. Using neighbor and port channel information, the application decides whether 1) there should be no port channel modifications because the new link is the only link with the neighbor device, or 2) a new port channel z should be created with interface y and the other interface already connected to the same neighbor, or 3) the new interface y should be added to port channel z because there already is a port channel with the neighbor.

When the switch detects a newly connected neighbor, the application uses neighbor and port channel information to determine whether there is already a port channel with this neighbor and, if not, whether there is already a single link with this neighbor, for instance by comparing the Chassis IDs of all connected neighbors. Figure 6.2 shows this determination process. There are three possible cases:

1. There is no other link with the neighbor device.

2. There is already a single link with the neighbor device.

3. There is already a port channel with the neighbor device.

In case 1, no port channel needs to be created since it would not provide redundancy. In case 2, a port channel needs to be created with the interface of the already present link and the interface of the new link. In case 3, the interface of the new link needs to be added to the port channel that is already present with the neighbor. Editing port channels can be done by editing SONiC's configuration file. To set up a working port channel, the neighbor device will also have to go through this same process to set up the port channel on the neighbor's side. Thus, the application must run on both devices.
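The three-case decision from figure 6.2 can be sketched as follows (a hypothetical helper; the interface names and chassis IDs are illustrative):

```python
# neighbors:    {interface: chassis_id} for links already detected
# portchannels: {chassis_id: port channel name} for existing channels
def portchannel_action(neighbors, portchannels, new_if, new_id):
    if new_id in portchannels:                        # case 3
        return ("add_to", portchannels[new_id])
    same_peer = [i for i, cid in neighbors.items() if cid == new_id]
    if same_peer:                                     # case 2
        return ("create_with", same_peer + [new_if])
    return ("no_change", None)                        # case 1

print(portchannel_action({"Ethernet16": "chassis-A"}, {},
                         "Ethernet20", "chassis-A"))
# ('create_with', ['Ethernet16', 'Ethernet20'])
```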


CHAPTER 7

Conclusions

Open switches play an increasingly significant role in innovative networks. We have examined the deployment of several networking features that we believe are essential for a network switch, and did so using multiple topologies with open switches to simulate different scenarios. In the discussion, we provided remarks about some experimental results and shed some light on the ease of use of a SONiC switch. In the last chapter, we briefly examined a current use case of open switches with SONiC and provided two possible use cases ourselves.

We can conclude that the networking features that we tested and that SONiC states to support (LLDP, LACP (Layer 3), VLAN trunking, inter-VLAN routing and BGP) can in fact be deployed successfully on open switches with SONiC. In the discussion, we noted that some of these features required additional configuration, but all of them were eventually deployed successfully.

In addition, we can conclude that open switches with SONiC are suitable for use cases in large data centers as well as in enterprise networks, because SONiC allows the network administrator to update features without data plane downtime. In addition, the network administrator is able to keep up with hardware innovations while no changes to the software stack need to be made. Also, SONiC provides innovative applications to monitor the network and automatically detect network failures. The openness of SONiC gives the network administrator more possibilities to customize and extend software on the switch.

7.1 Future research

In future research, experiments can be performed with more, and different, open switches. Examples are several Dell and Edgecore devices, which are supported by SONiC. This would allow more interesting and realistic network topologies for feature testing. In addition, since SONiC develops constantly, retesting with SONiC after a certain amount of time could in principle yield different results.

Furthermore, other open-source network operating systems, such as OPX, can be tested as well to allow for a comparison with SONiC. Interoperability testing between SONiC and traditional, closed network operating systems can also be performed in future research, since a network generally consists of devices with different operating systems.

Lastly, future research can be done on the use cases of SONiC. Additional use cases can be examined and the described example use cases can be implemented and tested.


CHAPTER 8

Acknowledgements

I would like to thank my supervisors, Paola Grosso and Lukasz Makowski, for their support and guidance throughout this research and the writing of this thesis. Moreover, I would like to thank the SNE OpenLab for providing the devices that were required in this research.


CHAPTER 9

Bibliography

[1] Neil Briscoe. “Understanding the OSI 7-layer model”. In: PC Network Advisor 120.2 (2000). [2] Zeus Kerravala. White boxes are now ready for prime time. Consulted on 03-04-2018. 2016. url: https : / / www . networkworld . com / article / 3100927 / network switch / white -boxes-are-now-ready-for-prime-time.html.

[3] Mike Sheldon. The future of networking: Its in a white box. Consulted on 03-04-2018. 2017. url: https://www.networkworld.com/article/3182138/hardware/the- future- of-networking-its-in-a-white-box.html.

[4] Margaret Rouse. ASIC (application-specific integrated circuit). Consulted on 09-04-2018. 2005. url: https://whatis.techtarget.com/definition/ASIC-application-specific-integrated-circuit.

[5] Microsoft Azure. Software for Open Networking in the Cloud SONiC. Consulted on 03-04-2018. 03-04-2018. url: http://azure.github.io/SONiC/.

[6] Ivan Pepelnjak. Management, Control and Data Planes in Network Devices and Systems. Consulted on 09-04-2018. 2013. url: http://blog.ipspace.net/2013/08/management-control-and-data-planes-in.html.

[7] Jeff Doyle. Clearing the fog around open switching terminology. Consulted on 09-04-2018. 2015. url: https : / / www . networkworld . com / article / 2919599 / cisco - subnet / clearing-the-fog-around-open-switching-terminology.html.

[8] Kamala Subramaniam. Switch Abstraction Interface (SAI) officially accepted by the Open Compute Project (OCP). Consulted on 12-04-2018. 2015. url: https://azure.microsoft. com/en- us/blog/switch- abstraction- interface- sai- officially- accepted- by-the-open-compute-project-ocp/.

[9] Mellanox Technologies. Switch Software Development Kit. Consulted on 14-04-2018. 2018. url: http : / / www . mellanox . com / page / products _ dyn ? product _ family = 124 & mtag = switchx_sdk.

[10] Broadcom. OpenNSL. Consulted on 14-04-2018. 2018. url: https://www.broadcom.com/ products/ethernet-connectivity/software/opennsl/.

[11] Lihua Yuan. Features and Roadmap. Consulted on 16-04-2018. 2018. url: https : / / github.com/Azure/SONiC/wiki/Features-and-Roadmap.

[12] Microsoft Azure. Architecture. Consulted on 16-04-2018. 2017. url: https://github.com/ Azure/SONiC/wiki/Architecture.

[13] Yousef Khalidi. SONiC: The networking switch software that powers the Microsoft Global Cloud. Consulted on 01-06-2018. 2017. url: https://azure.microsoft.com/en- us/ blog / sonic the networking switch software that powers the microsoft -global-cloud/.

(40)

[14] James F. Kurose and Keith W. Ross. Computer Networking: A Top-Down Approach. Sixth edition. Pearson, 2013.

[15] “IEEE Standard for Local and metropolitan area networks - Station and Media Access Con-trol Connectivity Discovery”. In: IEEE Std 2016 (Revision of IEEE Std 802.1AB-2009) (2016), pp. 1–146. doi: 10.1109/IEEESTD.2016.7433915.

[16] Mick Seaman. “Link aggregation control protocol”. In: IEEE http://grouper. ieee. org/-groups/802/3/ad/public/mar99/seaman 1 (1999), p. 0399.

[17] Alan Elder and Jonathan Harrison. Spanning tree protocol. US Patent App. 14/673,652. 2015.

[18] “IEEE Standard for Local and metropolitan area networks: Media Access Control (MAC) Bridges”. In: IEEE Std 802.1D-2004 (Revision of IEEE Std 802.1D-1998) (2004), pp. 1– 281. doi: 10.1109/IEEESTD.2004.94569.

[19] Spanning Tree Concepts. Consulted on 06-06-2018. 2016. url: https://seeseenayy.blogspot.com/2016/09/ccnav3-chapter-2-notes-spanning-tree.html.

[20] David J Husak. Direct addressing between VLAN subnets. US Patent 6,157,647. 2000.

[21] ExamCollection. How to verify and configure VLANs trunking. Consulted on 03-05-2018. 2018. url: https://www.examcollection.com/certification-training/ccnp-configure-and-verify-vlans-and-trunking.html.

[22] CCNA Blog. Introduction to inter-vlan routing. Consulted on 06-06-2018. 2018. url: http://www.ccnablog.com/inter-vlan-routing/.

[23] Kunihiro Ishiguro et al. System Architecture. Consulted on 16-04-2018. 2005. url: https://www.quagga.net/docs/quagga.html#System-Architecture.

[24] Onsel Kuluk. BGP in the Data Center: Part One. Consulted on 07-06-2018. 2018. url: https://www.packetdesign.com/blog/bgp-in-the-data-center-part-1/.

[25] Cumulus Networks. Open Network Install Environment. Consulted on 19-04-2018. 2018. url: https://opencomputeproject.github.io/onie/.

[26] Arista Networks. Boot Loader Aboot. Consulted on 20-04-2018. 2018. url: https://www.arista.com/ko/um-eos/eos-6-1-boot-loader--aboot.

[27] Microsoft. CloudBrain for Automatic Troubleshooting for the Cloud. Consulted on 05-06-2018. 2016. url: https://www.microsoft.com/en-us/research/project/cloudbrain/.

[28] Yibo Zhu et al. “Packet-level telemetry in large datacenter networks”. In: ACM SIGCOMM


CHAPTER 10

Appendix

A: Platform summary

Listings 10.1 and 10.2 show a platform summary of Mellanox SN2100 and Arista 7050QX-32S, respectively.

Mellanox

admin@sonic-mellanox:~$ show platform summary
Platform: x86_64-mlnx_msn2100-r0
HwSKU: ACS-MSN2100
ASIC: mellanox

Listing 10.1: Mellanox SN2100 device information.

Arista

admin@sonic-arista:~$ show platform summary
Platform: x86_64-arista_7050_qx32s
HwSKU: Arista-7050-QX-32S
ASIC: broadcom

Listing 10.2: Arista 7050QX-32S device information.


B: Software information

B1: SONiC version

Listings 10.3 and 10.4 show information about the versions of SONiC that were used on Mellanox SN2100 and Arista 7050QX-32S, respectively.

Mellanox

admin@sonic-mellanox:~$ show version

SONiC Software Version: SONiC.HEAD.574-ed915e3
Distribution: Debian 8.10
Kernel: 3.16.0-5-amd64
Build commit: ed915e3
Build date: Wed Apr 4 06:58:48 UTC 2018
Built by: johnar@jenkins-worker-3

Docker images:
REPOSITORY                 TAG               IMAGE ID      SIZE
docker-orchagent-mlnx      HEAD.574-ed915e3  32203598a5bc  287 MB
docker-orchagent-mlnx      latest            32203598a5bc  287 MB
docker-syncd-mlnx          HEAD.574-ed915e3  b7ab299d2758  350.9 MB
docker-syncd-mlnx          latest            b7ab299d2758  350.9 MB
docker-lldp-sv2            HEAD.574-ed915e3  d8f65d12b406  297.2 MB
docker-lldp-sv2            latest            d8f65d12b406  297.2 MB
docker-dhcp-relay          HEAD.574-ed915e3  46c2a839b5ba  280.1 MB
docker-dhcp-relay          latest            46c2a839b5ba  280.1 MB
docker-database            HEAD.574-ed915e3  622f0f354847  278.8 MB
docker-database            latest            622f0f354847  278.8 MB
docker-teamd               HEAD.574-ed915e3  9dd25e367798  284.1 MB
docker-teamd               latest            9dd25e367798  284.1 MB
docker-snmp-sv2            HEAD.574-ed915e3  4b4277d6cc32  319.3 MB
docker-snmp-sv2            latest            4b4277d6cc32  319.3 MB
docker-router-advertiser   HEAD.574-ed915e3  6c2adf7743ec  276.4 MB
docker-router-advertiser   latest            6c2adf7743ec  276.4 MB
docker-platform-monitor    HEAD.574-ed915e3  9b31194ff812  298.3 MB
docker-platform-monitor    latest            9b31194ff812  298.3 MB
docker-fpm-quagga          HEAD.574-ed915e3  67c77efc3d4a  290.6 MB
docker-fpm-quagga          latest            67c77efc3d4a  290.6 MB

Listing 10.3: SONiC version information on Mellanox SN2100.


Arista

admin@sonic-arista:~$ show version

SONiC Software Version: SONiC.HEAD.547-4754b43
Distribution: Debian 8.10
Kernel: 3.16.0-5-amd64
Build commit: 4754b43
Build date: Fri Apr 6 07:38:17 UTC 2018
Built by: johnar@jenkins-worker-4

Docker images:
REPOSITORY                 TAG               IMAGE ID      SIZE
docker-syncd-brcm          HEAD.547-4754b43  145a93bf2613  358.1 MB
docker-syncd-brcm          latest            145a93bf2613  358.1 MB
docker-orchagent-brcm      HEAD.547-4754b43  32fdab0a0a85  287 MB
docker-orchagent-brcm      latest            32fdab0a0a85  287 MB
docker-lldp-sv2            HEAD.547-4754b43  9ce7dc0f55f6  297.2 MB
docker-lldp-sv2            latest            9ce7dc0f55f6  297.2 MB
docker-dhcp-relay          HEAD.547-4754b43  17fd00cd2091  280.1 MB
docker-dhcp-relay          latest            17fd00cd2091  280.1 MB
docker-database            HEAD.547-4754b43  5af52a038baf  278.8 MB
docker-database            latest            5af52a038baf  278.8 MB
docker-teamd               HEAD.547-4754b43  24d8a02873a7  284.1 MB
docker-teamd               latest            24d8a02873a7  284.1 MB
docker-snmp-sv2            HEAD.547-4754b43  129e2b96b2c4  319.3 MB
docker-snmp-sv2            latest            129e2b96b2c4  319.3 MB
docker-router-advertiser   HEAD.547-4754b43  58be616fcdb1  276.4 MB
docker-router-advertiser   latest            58be616fcdb1  276.4 MB
docker-platform-monitor    HEAD.547-4754b43  6b68d76ae87b  298.3 MB
docker-platform-monitor    latest            6b68d76ae87b  298.3 MB
docker-fpm-quagga          HEAD.547-4754b43  0ff216021fea  290.6 MB
docker-fpm-quagga          latest            0ff216021fea  290.6 MB

Listing 10.4: SONiC version information on Arista 7050QX-32S.


B2: Vagrant version

Listing 10.5 specifies the version of Vagrant used in the experiments.

root@hosta:~/vagrantfiles# vagrant version
Installed Version: 2.0.4
Latest Version: 2.1.1

root@hostb:~/vagrantfiles# vagrant version
Installed Version: 2.0.4
Latest Version: 2.1.1

Listing 10.5: Vagrant version used on servers A and B.

B3: VirtualBox version

Listing 10.6 specifies the version of VirtualBox used in the experiments.

root@hosta:~/vagrantfiles# vboxmanage --version
5.2.10r122088

root@hostb:~/vagrantfiles# vboxmanage --version
5.2.10_Debianr121806

Listing 10.6: VirtualBox version used on servers A and B.
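Note that `vboxmanage --version` reports the two installations with different revision suffixes (an upstream build on server A, a Debian-packaged build on server B). What matters for reproducing the experiments is the common base version; a minimal Python sketch (the version strings are taken verbatim from the listing above) shows how the suffix can be stripped for comparison:

```python
import re

# Version strings reported by "vboxmanage --version" on servers A and B.
raw = ["5.2.10r122088", "5.2.10_Debianr121806"]

# Keep only the leading "major.minor.patch" part, dropping the
# revision/packaging suffix, so the two installations can be compared.
base = [re.match(r"(\d+\.\d+\.\d+)", v).group(1) for v in raw]
print(base)  # -> ['5.2.10', '5.2.10']
```

Both servers therefore run the same VirtualBox base version, 5.2.10.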
