
Multi-party data delivery with Switch Abstraction Interface (SAI)


Academic year: 2021


Bachelor Informatica

Multi-party data delivery with Switch Abstraction Interface (SAI)

Rutger Beltman

June 17, 2019

Supervisor(s): Lukasz Makowski M. Sc., UvA Chris Broekema M. Sc., ASTRON Dr. Paola Grosso, UvA

Informatica, Universiteit van Amsterdam

Abstract

In this research we examine Azure SONiC and SAI for use in the Square Kilometer Array. Azure SONiC is an open source Network Operating System that is designed for use with switches from different manufacturers. SAI is meant as an interface between the switch hardware and a NOS like SONiC. This interface allows the same NOS to talk to different vendor ASICs. Because of the complex workflows in SKA, we need to create a system that allows for many-to-many communication. In this research we will examine two ways to do this: Unicast data streaming, where every receiver gets a targeted packet, and multicast data streaming, where packets are sent using multicasting as its underlying system. We examined these features on switches from Arista and Mellanox and we identified that key features to make these work are missing. This problem lies with the partial implementation of SAI by the manufacturers.


Contents

1 Introduction
    1.1 Research question
    1.2 Thesis structure
2 Square Kilometer Array
    2.1 Square Kilometer Array
    2.2 Publish-subscribe system
        2.2.1 Unicast stream delivery
        2.2.2 Multicast stream delivery
    2.3 Related work
3 Azure SONiC
    3.1 Switch Abstraction Interface
    3.2 NOS design
        3.2.1 Example of services
4 Methods
    4.1 Unicast stream delivery
        4.1.1 Dropping
        4.1.2 Redirection
        4.1.3 Duplication
    4.2 Multicast stream delivery
5 Experiments
    5.1 Lab setup
    5.2 Unicast stream delivery
        5.2.1 Dropping
        5.2.2 Redirection
        5.2.3 Duplication
    5.3 Multicast stream delivery
6 Discussion
    6.1 Unicast stream delivery
    6.2 Multicast stream delivery
    6.3 General discussion
    6.4 Practical implementation
    6.5 Future work


Glossary

ACL Access Control List.

API Application Programming Interface.

ASIC Application Specific Integrated Circuit.

FDB Forwarding Database.

IP Internet Protocol.

l2 Layer 2.

l3 Layer 3.

MAC Media Access Control.

NOS Network Operating System.

SAI Switch Abstraction Interface.

SDK Software Development Kit.

SKA Square Kilometer Array.

SONiC Software for Open Networking in the Cloud.

SWSS Switch State Service.

CHAPTER 1

Introduction

ASTRON, the Netherlands Institute for Radio Astronomy, is currently assisting in the development of a new software telescope with the Square Kilometer Array[1]. With a large number of antennas that can collectively send petabits per second [2], it is important to efficiently transport the data from the telescopes to the processors. This task becomes more difficult because multiple processors are allowed to request data from the same telescopes[1].

Creating a publish-subscribe system is a possible way of getting this data to the hosts that need to process it. A publish-subscribe system is a system where receiving parties can subscribe to data that a publishing party has to offer[1]. There are two different ways to deliver packets for this publish-subscribe system.

The first would be using unicast stream delivery. This would mean that the switch gets one packet on the input and outputs multiple packets with the destination IP and MAC address set to the receivers.

Multicasting, on the other hand, would use an IP address from the multicast address range. These addresses are not assigned to a single host. A receiver can set this address on an interface and then receive traffic destined to this address. This allows the switch to send data to the receiving hosts in the multicast group. The switch then needs to be configured to forward the data to the receiver. After this configuration the receiver will receive packets sent to the multicast address.

We will examine Azure SONiC and test whether it supports the features needed for this publish-subscribe system. Azure SONiC is a platform designed by Microsoft to create a platform-independent open-source NOS[3]. One important component that makes this platform independent is SAI [4]. To make networking within the switch happen, manufacturers make use of an ASIC. These ASICs are created by companies like Broadcom and Mellanox. The ASIC manufacturer then develops an SDK that allows a programmer to implement functionality for the switch. The problem with this is that there is not a lot of coordination between manufacturers. This causes the SDKs to differ between manufacturers and makes it more difficult to port the interface to the ASIC. The goal of SAI is to make an open source interface that allows the same interface with the ASIC to be run on ASICs from different manufacturers. It is then up to the manufacturers to create an implementation of SAI which translates SAI calls to their own proprietary SDK[4]. In this research, we will examine both Azure SONiC and SAI and test their functionality to create a publish-subscribe system.

1.1 Research question

To test for the feasibility of a publish-subscribe system we will ask the following question: Is a platform like Azure SONiC suitable to support a publish-subscribe system for the SKA use case? To determine this we will set up two sub-questions to test for this functionality. The first question looks at unicast data streaming and the second question looks at multicast data streaming.

1. Is it possible to realize unicast data streaming with SAI and SONiC?

2. Is it possible to realize multicast data streaming with SAI and SONiC?

1.2 Thesis structure

1. Introduction

In the introduction, we will give an overview of the problem and define the research questions.

2. Square Kilometer Array

In this chapter, we will discuss what SKA is and how a publish-subscribe system would aid in solving the networking challenges it faces. We will also explain the background of the two possible methods that this system could be based on: unicast stream delivery and multicast stream delivery. Finally, we will look at a study that tried to accomplish the same for SKA but found support to be lacking.

3. Azure SONiC

Azure SONiC is a NOS created by Microsoft. We will examine the design of this NOS and explore the path a configuration rule has to follow to get programmed into the ASIC.

4. Methods

The data stream delivery will be based on SAI (section 3.1). We will examine ways to make unicast and multicast stream delivery work on SONiC.

5. Experiments

After describing the possible ways to create the data stream delivery system we will try to implement them by using the SAI API. We will implement the SAI calls in SONiC and test if the manufacturers support the API calls.

6. Discussion

In the discussion, we will reflect on the results we obtained from the experiments.

7. Conclusions

With the conclusions, we look at the two methods and decide which one is preferable to work on. We will also give possible explanations why features did or did not work, and conclude whether unicast or multicast data streaming should be chosen as a basis for the networking challenges of the SKA use case. Finally, we will make a comparison between switches from Broadcom and Mellanox and conclude which one offers better support for a stream delivery system.


CHAPTER 2

Square Kilometer Array

2.1 Square Kilometer Array

The Square Kilometer Array is a radio telescope currently in the development stage. It will be the largest of its kind in the world[5]. SKA consists of two different arrays. SKA1-Low will consist of more than 100,000 low-frequency dipoles and will be built in Australia[6]. SKA1-Mid, on the other hand, will consist of 200 dishes and will be placed in South Africa[7]. SKA1 is only the first phase, and both sites are planned to expand during the second phase[5].

Figure 2.1: Overview of the Science Data Processor used in the SKA telescope. Image originally from [8], used with permission by Chris Broekema. (Diagram labels: concentrator, SDN capable switch, switch, ingest nodes, compute island with batch processing, medium performance storage, long term storage, mirror, science archive, WAN to regional science centres.)

Figure 2.1 shows an overview of what the network would look like within the Square Kilometer Array. The concentrator takes the data streams from multiple telescopes and bundles them together into concentrated UDP streams[1]. The concentrator sends these streams towards the "SDN capable switch". This switch is responsible for getting data from the telescopes to the compute islands that request it[8].

An important factor with software telescopes is that there exists a many-to-many relationship between the telescopes and the compute islands. This means that multiple compute islands may need streams from the same telescopes. Even though the telescopes can generate UDP packets, they do not have a full network stack integrated. This means that the telescopes are "dumb" and will just output each packet with a statically configured destination IP/MAC address in its header. To solve this problem, it would be advantageous to move the burden of choosing the correct destination from the telescopes to the network itself[1].

2.2 Publish-subscribe system

One proposed way of making this communication happen is by creating a publish-subscribe system[1]. In such a system we make a separation between producers and consumers. The producers send data streams to which the consumers can subscribe. When a consumer subscribes, a network controller notifies the SDN capable switch that this subscriber wants to receive the stream as well. With this publish-subscribe system, we will have three different cases.

1. When there are no receivers subscribed, the packet should be dropped. The default behavior of a switch, when a MAC address is not in the MAC address table, is to flood the traffic to all connected devices. This behavior is unwanted as it would flood the network with unused data. What we want instead is for the traffic to be dropped.

2. Whenever a single receiver is subscribed to a stream, a packet should be exclusively sent to the host that is subscribed.

3. The last and most difficult case happens when multiple receivers are subscribed to the same stream. In this case, the switch has to forward the packet to all subscribed receivers.

To achieve the publish-subscribe system two methods could be used. With unicast stream delivery, the switch will send a unicast packet to each of the subscribed hosts. This means that every packet has to get its destination IP and MAC address changed to the IP and MAC address of a subscribed host. With multicast stream delivery, we use multicasting to get the packet to all of the subscribed hosts. In the following sections we will explain unicast and multicast stream delivery in more detail.
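The three cases can be illustrated with a small sketch. It assumes a hypothetical controller-side table that maps a stream to its subscribed receivers; the class and method names are illustrative and are not part of SONiC or SAI.

```python
from dataclasses import dataclass, field


@dataclass
class StreamTable:
    """Hypothetical controller-side view: stream id -> subscribed receivers."""
    subscribers: dict = field(default_factory=dict)

    def subscribe(self, stream, receiver):
        self.subscribers.setdefault(stream, set()).add(receiver)

    def unsubscribe(self, stream, receiver):
        self.subscribers.get(stream, set()).discard(receiver)

    def action(self, stream):
        """Return the forwarding decision for one packet of `stream`."""
        receivers = sorted(self.subscribers.get(stream, set()))
        if not receivers:
            return ("drop", [])             # case 1: no subscribers
        if len(receivers) == 1:
            return ("redirect", receivers)  # case 2: exactly one subscriber
        return ("duplicate", receivers)     # case 3: several subscribers
```

Whichever delivery method is used, the switch has to realise the same three outcomes; the methods differ only in how the decision is programmed into the ASIC.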

2.2.1 Unicast stream delivery

With unicast routing, we make use of the unicast address space to get a packet to its destination. This means that the host has an IP and MAC address to which we send the packet. The switch will then send the packet to its destination. Unicast stream delivery is different. The publishing host sends packets with statically configured header fields to the switch. The IP and MAC addresses in the header fields will then be replaced with those of the subscribed host. If there are multiple hosts subscribed, the packet needs to be duplicated in addition to the modifications to the header fields[9]. The main advantage of this system is that it places no special requirements on the network between the switch and the subscribed hosts, because it is based on unicast routing.

2.2.2 Multicast stream delivery

Multicasting allows a router to take a packet from a single source node and send copies to a subset of the available network nodes[10]. The difficulty with multicasting lies in figuring out which destinations are subscribed to the packet. With unicast routing, the destination IP and MAC are located in the header, but with multicast routing the destination IP and MAC are part of the multicast address space. This means there is no information in the header fields that identifies the subscribers[11]. Multicasting works by creating multicast groups to which hosts can subscribe. A class D (multicast) IP address is then attached to the group. When a packet with this destination IP address arrives at the switch, the packet is forwarded towards all members of the multicast group. All multicast addresses lie within the subnet 224.0.0.0/4. This means that there are 2^28 possible IP addresses.

A multicast MAC address is composed of two parts. The first part is always the constant prefix 01:00:5e. The last 23 bits of the MAC address are derived from the last 23 bits of the IP address[12]. Combining these two parts gives us the multicast MAC address. This means that there are only 2^23 unique MAC addresses. With the number of possible MAC addresses being lower than the number of IP addresses, the same MAC address can be derived from multiple IP addresses. For example, the IP addresses 224.43.73.123 and 239.43.73.123 both map to the MAC address 01:00:5e:2b:49:7b. While this does not break multicasting, it should be taken into account.
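The mapping can be checked with a few lines of Python. This is a minimal sketch of the rule described above (constant prefix 01:00:5e followed by the last 23 bits of the IP address); the function name is illustrative.

```python
import ipaddress


def multicast_mac(ip):
    """Map an IPv4 multicast address to its Ethernet multicast MAC address."""
    addr = ipaddress.IPv4Address(ip)
    if not addr.is_multicast:
        raise ValueError(f"{ip} is not in 224.0.0.0/4")
    low23 = int(addr) & 0x7FFFFF      # keep only the last 23 bits of the IP
    mac = (0x01005E << 24) | low23    # prefix 01:00:5e, the 25th bit stays 0
    return ":".join(f"{(mac >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))


# Both example addresses from the text collapse onto the same MAC address.
print(multicast_mac("224.43.73.123"))
print(multicast_mac("239.43.73.123"))
```

Since 28 address bits are squeezed into 23, each multicast MAC address is shared by 2^5 = 32 multicast IP addresses.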

2.3 Related work

In 2016 a similar study was conducted in cooperation between the UvA and ASTRON [9][13]. In this research the researcher attempted to build the publish-subscribe system described above, but with the help of OpenFlow. OpenFlow was chosen for a couple of reasons. One big advantage is that it was seen as the most common implementation of Software Defined Networking. The two main features examined in this paper were traffic redirection and traffic duplication. Traffic dropping was not included, because the default behavior already is to drop unknown traffic.

With traffic redirection, it was discovered that the modification of both the MAC address and the IP address was supported, but the IP address modification was handled by the CPU. This meant that the IP address modification feature had a large performance penalty, and it was therefore decided to drop the IP modification and only focus on the MAC address modification.

With the duplication of traffic, problems were also encountered. The researcher made a theoretical implementation to send the packet to multiple destinations, but he discovered that the actual implementations of OpenFlow in switches were not able to perform this. He decided to work around this issue by keeping the IP address static and setting the destination MAC address to the broadcast address FF:FF:FF:FF:FF:FF. To make sure the packet exclusively goes to the correct destinations, he used explicit forwarding. Explicit forwarding means that the output ports are stated explicitly and the packet is forwarded towards these ports. With this forwarding method, the MAC address is not used. Analyzing this research, we can see that he tried to build unicast stream delivery.


CHAPTER 3

Azure SONiC

It used to be the case with traditional networking equipment that a customer would be locked in with the software that the manufacturer provided. The motivation of white-label switches is to break the hardware layer and software layer apart. They do this by allowing the user to install any Network Operating System of choice on top of the switch. This gives more power to network engineers, because the choice of operating system lets them tailor the switch to their specific needs. It also allows the user to consistently use the same NOS across switches of different manufacturers on the same network[14].

Azure SONiC is an effort from Microsoft to create an open source NOS aimed at cloud data centers[3]. Azure SONiC aims to build a platform that can run on multi-vendor switches, in a modular way where the user can update part of the switch without downtime. The ability to have SONiC run on switches from multiple vendors allows the user to easily transfer the same configuration file to a switch from a different manufacturer. In this chapter we will talk about what white-label switches are, what SAI does and how SONiC loads configuration rules into the ASIC.

3.1 Switch Abstraction Interface

The Switch Abstraction Interface was created as an effort to provide a vendor-independent C API for programmers to interact with the ASIC. This was done because before SAI there was no common and well-understood way to program the ASIC. The Open Compute Project releases header files for the C API. As discussed in the introduction, manufacturers then have to implement this API and translate SAI calls to their SDK. The SAI API allows programmers to make their code run on all switches that support it. The SAI project is organized by the Open Compute Project with collaboration from companies like Microsoft, Broadcom, Mellanox and Barefoot[4]. All parties collaborate on the SAI project through proposals on the SAI GitHub (footnote 1). The full coverage of the API can also be found there.

We will look at two manufacturers that support SAI: Broadcom and Mellanox.

Broadcom created its SAI implementation on top of OpenNSL (footnote 2). OpenNSL was designed on top of the closed source SDK of Broadcom as a way to provide developer access to part of the SDK without releasing the full SDK. The SAI implementation on the Broadcom platform is released as Debian package files, which means unimplemented features cannot be manually added to the SAI implementation (footnote 3).

Mellanox uses its own custom SDK: SwitchX. The header files of SwitchX, together with the documentation, are provided via GitHub (footnote 4). The C code and header files of the SAI implementation are fully open source and also available on GitHub (footnote 5).

1. https://github.com/opencomputeproject/SAI/blob/master/inc/

2. https://broadcom-switch.github.io/OpenNSL/doc/html/OPENNSL_OVERVIEW.html

3. https://github.com/Broadcom-Switch/SAI

4. https://github.com/Mellanox/SwitchRouterSDK-interfaces/tree/master/SDK/source

3.2 NOS design

SONiC works with the help of multiple services that each have a dedicated job. This separation allows individual components of the switch to be updated while the other components keep functioning without downtime. An overview of all components in SONiC can be found in figure 3.1.

Figure 3.1: Overview of the Azure SONiC Network Operating System[3]

The three most important parts in SONiC are the SWSS, database and syncd services.

• The database is central to the entire operating system. It contains a Redis database that works as a key-value store. Part of the job of the database is to share information about the configuration with the SWSS service and to store the SAI calls before they are put into the ASIC.

• The SWSS is a critical part of Azure SONiC that makes all of the parts work together. The SWSS service does this by running an orchestration agent for each of the services available to the operating system. For example: an orchestrator that manages the FDB.

• The final service of importance to us is syncd. Syncd provides a mechanism to synchronize the hardware of the switch with the SAI calls generated by the orchestrators in the SWSS service. Syncd has two jobs. The first job is to take a rule from the ASIC database and call the appropriate API function from the SAI API. The second role of syncd is to take state changes from the ASIC and push them to the SONiC databases. One example would be updating counters which contain the number of bytes that went through a port.

3.2.1 Example of services

To describe the path configuration rules have to take, we will give an example of how a configuration rule gets transformed into an ASIC rule. This path is important to understand, because we need to understand all of its parts to be able to make modifications and add functionality that is not implemented by default. To explain this path, we will use the transformation of an ACL entry as an example. The simplest way to explain ACLs is as a stateless filter on network traffic that controls the movement of data. The first part of an ACL is the table. An ACL table is connected to physical ports of the switch. ACL rules can then be attached to these tables to make an action happen for certain traffic. ACL rules can support a wide variety of actions, for example changing fields in the header of a packet or redirecting a packet to another destination.

Figure 3.2: The path an ACL rule has to take before being put into the ASIC.

In figure 3.2 we have a visual representation of the path and in the following list we will give an explanation for each step:

1. The first step of our process is to define an ACL rule to be added to the database. We will take the rule from listing 3.1 and put it into the database.

    "ACL_RULE|DATAACL|RULE_9"
    "PRIORITY": "9991"
    "PACKET_ACTION": "DROP"
    "SRC_IP": "192.168.1.2/32"

Listing 3.1: The configuration rule for packet dropping

This rule drops all traffic that comes from the IP address "192.168.1.2" and is attached to a table called "DATAACL".

2. The SWSS service gets a signal that a change is made in the configuration database. This causes the ACL rule to be retrieved from the database.


3. The configuration rule gets transformed into the SAI calls in listing 3.2.

    "ASIC_STATE:SAI_OBJECT_TYPE_ACL_ENTRY:oid:0x8000000000413"
    "SAI_ACL_ENTRY_ATTR_TABLE_ID":"oid:0x70000000003f6"
    "SAI_ACL_ENTRY_ATTR_PRIORITY":"9991"
    "SAI_ACL_ENTRY_ATTR_ADMIN_STATE":"true"
    "SAI_ACL_ENTRY_ATTR_ACTION_COUNTER":"oid:0x9000000000412"
    "SAI_ACL_ENTRY_ATTR_FIELD_SRC_IP":"192.168.1.2&mask:255.255.255.255"
    "SAI_ACL_ENTRY_ATTR_ACTION_PACKET_ACTION":"SAI_PACKET_ACTION_DROP"

Listing 3.2: The configuration rule transformed to SAI

The configuration has a lot of similarities to the SAI entries, but the SAI database entries are much more explicit. For example, the table the rule is a part of is represented by an object id instead of a reference to the name. The orchestrator keeps track of all of the objects in the database and makes sure they are linked correctly to the corresponding name. The transformed rule is serialized and then put in the ASIC database. This change triggers the syncd service to come into action and process the new database entry.

4. The syncd service gets a signal that a new entry is put into the ASIC database and fetches it.

5. The syncd service deserializes the database entry and makes all of the corresponding calls from the SAI API. These calls are translated by the SAI implementation into corresponding calls to the switch's SDK.
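The transformation in step 3 can be sketched as follows. This is an illustration only and not the code of the SONiC orchestrator: the function names and the name-to-object-id table are hypothetical, the object id is copied from listing 3.2, and the counter attribute is left out for brevity.

```python
import ipaddress
import json

# Hypothetical name -> object id table kept by the orchestrator
# (the value is taken from listing 3.2).
TABLE_OIDS = {"DATAACL": "oid:0x70000000003f6"}


def cidr_to_sai(cidr):
    """Turn '192.168.1.2/32' into the 'ip&mask:netmask' form of listing 3.2."""
    iface = ipaddress.IPv4Interface(cidr)
    return f"{iface.ip}&mask:{iface.netmask}"


def to_sai_entry(key, fields):
    """Translate a configuration rule (listing 3.1) into SAI attributes."""
    _, table, _rule = key.split("|")
    return {
        "SAI_ACL_ENTRY_ATTR_TABLE_ID": TABLE_OIDS[table],
        "SAI_ACL_ENTRY_ATTR_PRIORITY": fields["PRIORITY"],
        "SAI_ACL_ENTRY_ATTR_ADMIN_STATE": "true",
        "SAI_ACL_ENTRY_ATTR_FIELD_SRC_IP": cidr_to_sai(fields["SRC_IP"]),
        "SAI_ACL_ENTRY_ATTR_ACTION_PACKET_ACTION":
            "SAI_PACKET_ACTION_" + fields["PACKET_ACTION"],
    }


rule = {"PRIORITY": "9991", "PACKET_ACTION": "DROP", "SRC_IP": "192.168.1.2/32"}
entry = to_sai_entry("ACL_RULE|DATAACL|RULE_9", rule)
serialized = json.dumps(entry)   # what would be stored in the ASIC database
```

The sketch shows why the SAI entry is more explicit than the configuration rule: the table name becomes an object id and the prefix length becomes a full netmask.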


CHAPTER 4

Methods

In this chapter, we will look into possible functions within the SAI headers that can aid us with creating the publish-subscribe system. We will discuss possible solutions for two different methods. With unicast stream delivery, we take a stream from one of the telescopes and forward it to all subscribed compute islands. The destination MAC and IP address of these unicast packets will be modified to reflect the IP and MAC address of the receiving compute islands. The modified packets will then be forwarded to the correct port. Multicast stream delivery will take a packet and forward it to the ports which are members of the multicast group.

4.1 Unicast stream delivery

As explained in section 2.2, there are three separate cases with unicast stream delivery. When no hosts are subscribed to the stream of the telescope, the traffic generated should be dropped. When only one host is subscribed, the destination IP and MAC should be modified to the IP and MAC address of the subscribed compute island. When multiple hosts are subscribed, we will need to make a duplicated packet for each receiving island and redirect the packet to it. For all three cases we will look into the SAI header files to find possible ways of achieving this.

4.1.1 Dropping

We have chosen to drop traffic by using Access Control Lists. ACLs offer a lot of different filters on which we could match the data. We would drop the traffic based on the destination IP, because packets for which no receiver is subscribed should be dropped. Another filter we will add matches on UDP packets. This makes sure any non-UDP packets will continue to be sent around the network with default behaviour.
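Such a drop rule could look as follows in the configuration format of listing 3.1. This is a sketch under assumptions: the field names DST_IP and IP_PROTOCOL are assumed by analogy with SRC_IP in listing 3.1, and the helper function, rule name and example address are hypothetical.

```python
def drop_rule(table, rule_name, dst_ip, priority=9991):
    """Build a configuration rule that drops UDP packets sent to dst_ip."""
    key = f"ACL_RULE|{table}|{rule_name}"
    fields = {
        "PRIORITY": str(priority),
        "PACKET_ACTION": "DROP",
        "DST_IP": f"{dst_ip}/32",   # assumed field name, by analogy with SRC_IP
        "IP_PROTOCOL": "17",        # 17 is the IANA protocol number for UDP
    }
    return key, fields


# Example address is illustrative only.
key, fields = drop_rule("DATAACL", "RULE_DROP_STREAM", "192.168.1.2")
```

Because the rule only matches UDP packets for one destination, all other traffic keeps following the default forwarding behaviour.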

4.1.2 Redirection

The way we would deploy redirection rules is similar to dropping traffic. We create two filters to match a packet that should be redirected: the first filter matches the destination IP and the second matches the UDP protocol. We have two ACL actions that support the rewriting of MAC and IP addresses. The first is SAI_ACL_ACTION_TYPE_SET_DST_IP to rewrite the destination IP address. The second is SAI_ACL_ACTION_TYPE_SET_DST_MAC to rewrite the destination MAC address. In the first method we will try to use l3 routing and change only the IP. In the second method we will create a solution where both the MAC and IP address are changed. The third solution is a little bit more tricky. We will use an external cable that loops back to the switch[15]. With this loopback method we connect two ports on the switch together. When a packet is sent from one of these ports it arrives at the other port. This way we will be able to apply the two ACL actions to the packet.


Redirection with IP rewrite and routing (redirection method 1)

In the first redirection method we use a routing configuration for the switch as described in figure 5.2. We will create an ingress ACL rule with the action SAI_ACL_ACTION_TYPE_SET_DST_IP to modify the packet's destination IP address. Since we use routing, the MAC address will be correctly set to the MAC address of the destination.

Redirection with two actions (redirection method 2)

The simplest solution would be to create one ACL rule that contains two actions: SAI_ACL_ACTION_TYPE_SET_DST_IP and SAI_ACL_ACTION_TYPE_SET_DST_MAC. The big advantage of this solution is that we are able to change both the IP and MAC address in one cycle through the ASIC. The disadvantage is that this solution is unlikely to work, because ACL tables usually only allow one action per entry. We could work around this by creating one rule that applies when the packet enters the switch and a second rule that applies when the packet leaves through the egress port. Even though this solution is unlikely to work, it is still worth investigating. For this solution we will use the VLAN topology in figure 5.3.

Redirection with two cycles

Figure 4.1: The flow of the packet on a rewrite. Port 3 changes the IP address and port 1 changes the MAC address. (Diagram labels: sender, receiver 1, receiver 2, receiver 3, switch with ports 1 to 6.)

If it is not possible to get both rules in one cycle, we will split the ACL rule in two and make the packet go through the switch twice. In the first round through the switch we will change the IP address with SAI_ACL_ACTION_TYPE_SET_DST_IP. In the second round the MAC address will be rewritten with SAI_ACL_ACTION_TYPE_SET_DST_MAC. To make the packet go through the switch two times we will use an external loopback cable. This cable will make sure that any packets that go through it arrive back at the switch. The following example gives an overview of how the packet would be redirected to receiver 3 in figure 4.1.

1. Packet arrives at port 3.

2. Packet IP is rewritten to IP of the receiver 3.

3. Packet leaves from port 2.

4. Packet loops back and arrives at port 1.

5. Packet MAC is rewritten to MAC of receiver 3.

6. Packet leaves from port 6.
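The six steps can be traced with a small simulation. It is a sketch of the packet flow only and does not call SAI; the port numbers follow figure 4.1, while the addresses and the function name are made up for illustration.

```python
def two_cycle_rewrite(packet, receiver):
    """Follow one packet through the six steps; returns (packet, ports visited)."""
    path = [3]                                       # 1. arrives at port 3
    packet = {**packet, "dst_ip": receiver["ip"]}    # 2. first ACL rule sets the destination IP
    path.append(2)                                   # 3. leaves from port 2 into the loopback cable
    path.append(1)                                   # 4. arrives back at port 1
    packet = {**packet, "dst_mac": receiver["mac"]}  # 5. second ACL rule sets the destination MAC
    path.append(receiver["port"])                    # 6. leaves from the receiver's port
    return packet, path


# Illustrative addresses; only the port numbers come from figure 4.1.
receiver_3 = {"ip": "10.0.0.3", "mac": "aa:bb:cc:00:00:03", "port": 6}
static = {"dst_ip": "10.0.0.254", "dst_mac": "aa:bb:cc:00:00:fe", "payload": b"data"}
rewritten, ports = two_cycle_rewrite(static, receiver_3)
```

Each round through the switch applies exactly one ACL action, which is the property this method relies on when an ACL entry supports only a single action.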

Duplication with redirect action to multicast group (duplication method 1)

There is one last possible way which seems to be supported by the SAI header files. The Access Control List SAI headers support many types of redirect targets. Among these are multicast groups. It might be possible to create an ACL rule with a redirect target of either an l2 or l3 multicast group. The packet would then be redirected to each member of this multicast group. Further explanation of the multicast groups can be found in section 4.2.

Duplication with RSPAN (duplication method 2)

Figure 4.2: Connections on switch to allow for looping. (Diagram labels: sender, receiver 1, receiver 2, receiver 3, switch with ports 1 to 8.)

RSPAN is a mirroring protocol where the packet is encapsulated and sent on its way to a remote host that can analyze the packet[16]. We will use this protocol to create a duplicate of the packet and set the destination IP/MAC of the packet to that of the correct receiver.

This method will consist of two different parts. The first part is to create an ACL rule that mirrors with RSPAN towards one of the receivers. The second part is to alter the original packet by reducing the time-to-live (TTL) field of the packet. This way we can use a different ACL rule and mirror the packet to a different receiver. We will give an example of how this would work when a packet has to be sent to receiver 1 and receiver 2 in figure 4.2.

1. A packet arrives at port 5 with a TTL of 200.

2. An ACL rule that matches TTL 200 mirrors the packet towards receiver 1.

3. The original packet departs from port 4 and arrives at port 3.

4. The TTL of the packet is reduced by one.

5. The packet departs from port 2 and arrives at port 1.

6. An ACL rule that matches TTL 199 mirrors the packet towards receiver 2.

7. The original packet departs from port 4 and arrives at port 3.

8. The TTL of the packet is reduced by one.

9. The packet departs from port 2 and arrives at port 1.
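The loop can be simulated in a few lines. This is a sketch of the idea only, with a hypothetical function name; what happens to the original packet once no rule matches its TTL is not specified above and is left open here.

```python
def rspan_duplicate(initial_ttl, mirror_rules):
    """Simulate the TTL-driven mirroring loop.

    mirror_rules maps a TTL value to the receiver its ACL rule mirrors to.
    Returns the list of (ttl, receiver) copies and the TTL left on the original.
    """
    ttl = initial_ttl
    copies = []
    while ttl in mirror_rules:                   # an ACL rule matches the current TTL
        copies.append((ttl, mirror_rules[ttl]))  # mirror a copy towards that receiver
        ttl -= 1                                 # original goes through the loop, TTL reduced by one
    return copies, ttl


rules = {200: "receiver 1", 199: "receiver 2"}
copies, remaining_ttl = rspan_duplicate(200, rules)
```

The TTL acts as a counter that selects a different ACL rule on every pass, so one rule is needed per subscribed receiver.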

Duplication with SPAN (duplication method 3)

Figure 4.3: Connections on switch to allow for looping. (Diagram labels: sender, receiver 1, receiver 2, receiver 3, switch with ports 1 to 8.)

With the same looping method that we use with RSPAN, we could also use SPAN[16]. SPAN allows us to mirror a packet to a physical port on the switch. With this method you gain an exact copy of the packet. For this reason we will combine the redirection method with SPAN to duplicate the packets towards all of the hosts. For this method we assume the IP and MAC address can be rewritten in one cycle through the switch. We also assume that the original packet is targeted correctly to the first receiver. We will give an example of how this would work when a packet has to be sent to receiver 1 and receiver 2 in figure 4.2.

We assume the original packet already has its destination set to receiver 1.

1. A packet arrives at port 5.

2. An ACL rule mirrors a copy of the packet that departs from port 4. The original packet will be sent to receiver 1.

3. The mirrored packet arrives at port 3.

4. The mirrored packet's IP and MAC address are modified for receiver 2.

5. The packet departs from port 2 and arrives at port 1.

6. The packet is sent to receiver 2 and is not mirrored again, because receiver 2 is the last receiver to receive a packet.

4.2 Multicast stream delivery

Multicasting is by design intended to allow one host to send data to multiple destinations[11]. The SAI header files show three different methods to implement multicasting: IP multicasting, l2 multicasting and the multicast forwarding database.

With IP multicasting we use an IP address in the range 224.0.0.0/4 and create a multicast entry with this IP as its destination IP and the telescope's IP as the source IP. We then create a multicast group with the ports to the receiving compute islands. By managing this group we can accomplish all of the tasks of unicast stream delivery. The great thing about multicasting is that dropping, redirection and duplication are all included by default.
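How managing the group covers all three cases can be shown with a small model. This is a sketch and not the SAI multicast API: the class, its methods and the example addresses are hypothetical.

```python
import ipaddress


class MulticastTable:
    """Toy model: (source IP, group IP) -> set of output ports."""

    def __init__(self):
        self.groups = {}

    def join(self, source, group, port):
        if not ipaddress.IPv4Address(group).is_multicast:
            raise ValueError(f"{group} is not in 224.0.0.0/4")
        self.groups.setdefault((source, group), set()).add(port)

    def leave(self, source, group, port):
        self.groups.get((source, group), set()).discard(port)

    def output_ports(self, source, group):
        """Ports a packet is copied to; an empty list means the packet is dropped."""
        return sorted(self.groups.get((source, group), set()))


table = MulticastTable()
stream = ("192.168.1.2", "239.1.1.1")   # illustrative telescope IP and group address
no_members = table.output_ports(*stream)    # no members: dropped
table.join(*stream, port=6)
one_member = table.output_ports(*stream)    # one member: redirected
table.join(*stream, port=7)
two_members = table.output_ports(*stream)   # two members: duplicated
```

Adding or removing a port from the group is the only operation a controller would need, which is why no separate drop, redirect or duplicate rules are required.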


CHAPTER 5

Experiments

To discover whether Azure SONiC and SAI support the methods described in chapter 4, we used multiple techniques. The first technique was to write an actual implementation. We also looked into the implementation of the SAI API. Unicast stream delivery was tested first: we tested the features for dropping, redirection and duplication as described in chapter 4. After this, we looked at the support for multicast stream delivery.

5.1 Lab setup

To test our publish-subscribe methods we made use of two different switches. The first is the Arista 7050QX and the second is the Mellanox SN2010. Table 5.1 describes the SONiC version and hardware specification of both switches.

                         Arista                                  Mellanox
Switch                   Arista 7050QX                           Mellanox SN2010
Chipset                  Broadcom Trident 2                      Mellanox Spectrum
SONiC software version   SONiC.201807.0-dirty-20181105.042258    SONiC.HEAD.0-dirty-20190517.105532
Distribution             Debian 8.11                             Debian 9.9
Kernel version           3.16.0-5-amd64                          4.9.0-8-2-amd64
Build commit             56b8a2c                                 0a6dd88

Table 5.1: Hardware and software specification

Figure 5.1 shows the physical topology we used for our experiments. The switch management cable is used to manage the switch remotely via SSH. The data lines are used for the actual switching. The first wire from server 1 sends data to the switch. The other three wires are used to test whether the switch is able to send data to the correct hosts.


[Figure: server 1 and server 2 connected to the SONiC switch by data lines; a separate link is used for switch management]

Figure 5.1: Physical topology of the test setup.

[Figure: network switch with interfaces .220.1/24, .221.1/24, .222.1/24 and .223.1/24, connected to server 1 and server 2 with interfaces .220.10/24, .221.10/24, .222.10/24 and .223.10/24]


[Figure: network switch connected to server 1 and server 2 within VLAN 200, with addresses .1.2/24, .1.3/24, .1.4/24 and .1.5/24]

Figure 5.3: Topology with IP addresses for the setup simulating an l2 network.

5.2 Unicast stream delivery

We evaluated the SAI API for the unicast stream delivery features, but unfortunately support for these features is lacking in the SAI implementations we used. Dropping worked, but we already ran into trouble with redirection. For this reason we decided to stop further research on unicast stream delivery and focus on multicast stream delivery instead. The following three sections describe what we discovered with the SAI API for the unicast stream delivery features. A summary of the results can be found in table 5.2.

Feature                                 Broadcom   Mellanox
Drop with ACL                           ok         ok
Change MAC                              ok         ok
Change IP                               no         no
Duplicate by redirection to mc group    no         no


5.2.1 Dropping

Dropping packets with the SAI API was not a problem. We were able to use the SAI_PACKET_ACTION_DROP action in an ACL table to drop the traffic based on the destination IP address. This rule can be found in listing 5.1 and worked on both Arista and Mellanox.

# Drop UDP traffic (IP protocol 17) from 192.168.220.0/24; -n 4 selects SONiC's CONFIG_DB
redis-cli -n 4 hmset "ACL_RULE|DATAACL|RULE_9" \
    "PRIORITY" "9991" \
    "PACKET_ACTION" "DROP" \
    "SRC_IP" "192.168.220.0/24" \
    "IP_PROTOCOL" "17"

Listing 5.1: Rule that made packet dropping possible
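The rule in listing 5.1 is a Redis hash whose key encodes the ACL table and rule name and whose fields hold the match criteria and the action. As a minimal sketch, the same command can be assembled programmatically. The helper `acl_rule_command` is a hypothetical name introduced here; it only builds the argument list shown in the listing and does not contact a switch.

```python
import shlex
from typing import Dict, List

CONFIG_DB = 4  # database index used in the listing (redis-cli -n 4)


def acl_rule_command(table: str, rule: str, fields: Dict[str, str]) -> List[str]:
    """Build the redis-cli argument list for one ACL rule (hypothetical helper)."""
    key = f"ACL_RULE|{table}|{rule}"
    argv = ["redis-cli", "-n", str(CONFIG_DB), "hmset", key]
    for name, value in fields.items():
        argv += [name, value]  # field/value pairs follow the key
    return argv


drop_rule = acl_rule_command(
    "DATAACL",
    "RULE_9",
    {
        "PRIORITY": "9991",
        "PACKET_ACTION": "DROP",
        "SRC_IP": "192.168.220.0/24",
        "IP_PROTOCOL": "17",
    },
)
# shlex.join quotes the key so that the '|' characters are safe in a shell.
print(shlex.join(drop_rule))
```

Returning a list rather than a string keeps the command usable with `subprocess.run` without shell quoting concerns.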

5.2.2 Redirection

The method of changing the IP address with SAI_ACL_ACTION_TYPE_SET_DST_IP was implemented by neither manufacturer.

To test for support of the ACL action on Arista, we had to implement the action in SONiC ourselves and see whether we were able to redirect a packet or an error was generated. A modified version of the ACL orchestrator can be found on our GitHub¹. When creating the ACL entry to modify the IP address we get the error shown in listing 5.2.

_brcm_sai_create_acl_entry:4592 Error processing acl attributes
...
:- processEvent: failed to execute api: create, key:
SAI_OBJECT_TYPE_ACL_ENTRY:oid:0x8000000000611,
status: SAI_STATUS_ATTR_NOT_SUPPORTED_0

Listing 5.2: Error that shows SAI_ACL_ACTION_TYPE_SET_DST_IP is not supported

From this error we conclude that the Arista switch currently does not support the attribute.

We were able to confirm the lack of support for the “set destination IP” action on Mellanox by inspecting the SAI implementation² and confirming it with the manufacturer. When analyzing SONiC for support of the “set destination MAC” action we also ran into trouble. SONiC currently only supports ACL tables that work on l3. This means that the MAC address is not within the scope of an ACL rule, and it is therefore not possible to modify the destination MAC address through an ACL table. As a result, none of redirection methods 1, 2 and 3 can be implemented, because they all require the IP address to be changed through the ACL action.

We were able to change the MAC address through the use of the SAI_ACL_ACTION_TYPE_REDIRECT action. This was done by using the next-hop of a different receiver. The following rule redirected the packet to the next-hop with the IP address 192.168.222.10.

# Redirect UDP traffic (IP protocol 17) from 192.168.220.0/24 to the next-hop 192.168.222.10
redis-cli -n 4 hmset "ACL_RULE|DATAACL|RULE_9" \
    "PRIORITY" "9991" \
    "PACKET_ACTION" "REDIRECT:192.168.222.10" \
    "SRC_IP" "192.168.220.0/24" \
    "IP_PROTOCOL" "17"

Listing 5.3: Rule that made MAC redirection possible

With the rule described in listing 5.3, the MAC address got changed, but the IP address stayed the same.
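This observation can be modelled in a few lines. The sketch below is an assumption-laden illustration, not the switch's actual forwarding code: `Frame`, `redirect_to_next_hop`, the neighbor table and the MAC addresses are hypothetical. It shows why a redirect to a next-hop is not enough for our redirection methods: the destination MAC is taken from the next-hop, while the destination IP is left untouched.

```python
from dataclasses import dataclass, replace
from typing import Dict


@dataclass(frozen=True)
class Frame:
    """Hypothetical frame model with only the two destination fields."""
    dst_mac: str
    dst_ip: str


def redirect_to_next_hop(frame: Frame, next_hop_ip: str,
                         neighbors: Dict[str, str]) -> Frame:
    """Rewrite the destination MAC to that of the next-hop; keep the IP."""
    return replace(frame, dst_mac=neighbors[next_hop_ip])


# Illustrative neighbor table: next-hop IP -> MAC address.
neighbors = {"192.168.222.10": "02:00:00:00:00:22"}
before = Frame(dst_mac="02:00:00:00:00:21", dst_ip="192.168.221.10")
after = redirect_to_next_hop(before, "192.168.222.10", neighbors)
```

In this model the receiving host gets a frame addressed to its MAC but carrying another host's IP address.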


5.2.3 Duplication

For duplication method 1, we tried to create an ACL rule that redirects packets towards a multicast group. Unfortunately, multicast groups were not properly implemented in the SAI adapters of Mellanox and Arista. Evidence of this can be found in section 5.3.

Duplication method 2, where we wanted to use RSPAN to duplicate the packet, was not fully tested. The SONiC documentation claims that port mirroring with RSPAN is natively supported. Testing whether the looping of packets is indeed possible in the Azure SONiC implementation can be considered future work.

Duplication method 3, where we want to use SPAN, is not possible to implement. The reason for this is the lack of support for the SAI_ACL_ACTION_TYPE_SET_DST_IP action that we discovered in section 5.2.2. Because this action is required to make the looping method work, we decided not to investigate it further.

5.3 Multicast stream delivery

When analyzing the support for multicasting we started with Arista. We tested for support of the multicasting APIs by making a query for the IP multicasting API. This query produced the errors in listing 5.4.

failed to query api SAI_API_IPMC: SAI_STATUS_NOT_IMPLEMENTED
failed to query api SAI_API_IPMC_GROUP: SAI_STATUS_NOT_IMPLEMENTED
failed to query api SAI_API_L2MC: SAI_STATUS_NOT_IMPLEMENTED
failed to query api SAI_API_L2MC_GROUP: SAI_STATUS_NOT_IMPLEMENTED
failed to query api SAI_API_MCAST_FDB: SAI_STATUS_NOT_IMPLEMENTED

Listing 5.4: Errors that show that the Arista platform does not support multicasting
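When comparing platforms it can help to reduce such output to a per-API summary. The following is a minimal sketch that assumes the log lines have exactly the format shown in listing 5.4; the function name `unsupported_apis` is hypothetical and the pattern is not guaranteed to match other SONiC log formats.

```python
import re
from typing import Dict

# Matches lines of the form shown in listing 5.4.
QUERY_ERROR = re.compile(
    r"failed to query api (?P<api>SAI_API_\w+): (?P<status>SAI_STATUS_\w+)"
)


def unsupported_apis(log: str) -> Dict[str, str]:
    """Map each API named in a 'failed to query api' line to its status."""
    return {m["api"]: m["status"] for m in QUERY_ERROR.finditer(log)}


LISTING_5_4 = """\
failed to query api SAI_API_IPMC: SAI_STATUS_NOT_IMPLEMENTED
failed to query api SAI_API_IPMC_GROUP: SAI_STATUS_NOT_IMPLEMENTED
failed to query api SAI_API_L2MC: SAI_STATUS_NOT_IMPLEMENTED
failed to query api SAI_API_L2MC_GROUP: SAI_STATUS_NOT_IMPLEMENTED
failed to query api SAI_API_MCAST_FDB: SAI_STATUS_NOT_IMPLEMENTED
"""

result = unsupported_apis(LISTING_5_4)
```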

These errors tell us that it will be impossible to get multicasting working on the Arista switch. The same query on the Mellanox switch did not yield the same errors, but after examining the SAI API implementation we discovered that support was lacking. In the C code of the SAI adapter we only found ways to create L2MC groups. Since we also need to create L2MC or multicast FDB entries that make use of these groups, it is not possible to enable multicasting on the switch. We were also in contact with Mellanox support, who told us that they do not support multicast features in SAI. These findings lead us to the results in table 5.3.

Feature                Broadcom   Mellanox
IP multicast groups    no         no
IP multicast           no         no
L2 multicast groups    no         yes
L2 multicast           no         no
Multicast FDB          no         no


CHAPTER 6

Discussion

In this research we attempted to make a publish-subscribe system work for the SKA use case. The enormous amount of traffic and the specific requirements for the publish-subscribe system make this a tough challenge. We looked at the support of these requirements in SAI and Azure SONiC.

6.1 Unicast stream delivery

Dropping packets was no problem. A simple ACL rule was able to drop the stream of data from source to destination. Redirection was another story. We were able to change the MAC address through the redirect action. Because Azure SONiC currently has no support for ACL rules on l2, we were not able to use the corresponding action for changing the MAC address. Within the SAI documentation, we found only one way to modify the destination IP of a packet, namely an ACL action. Unfortunately, neither Mellanox nor Broadcom appears to support this feature within the SAI adapter. For packet duplication, we found some promising approaches. The redirect action of the ACL allows either l2 multicast or IP multicast groups as destinations. Because of the lack of support for l2 ACLs, the IP multicast group has to be implemented to make duplication of packets work. We chose not to implement duplication with port mirroring, because port mirroring allows a packet to be duplicated to only one destination. This means we would have to loop the packet through an external loopback cable, which could have performance implications.

6.2 Multicast stream delivery

We discovered that almost none of the multicast features were supported by Arista and Mellanox. Fortunately, we did find code for L2MC groups in the SAI implementation of Mellanox. The L2MC groups allow for a manual implementation of l2 multicasting. The only thing that would have to be added is a multicast entry that links an IP address to the L2MC group. For this reason Mellanox would be a promising platform on which to base multicast data stream delivery.

6.3 General discussion

With the lack of support for rewriting IP addresses on packets, we think that the best way forward is to use the multicast system. This could be done with a controller that sets all of the multicast groups manually on all switches. It is also interesting to note that rewriting the IP address was not supported in many OpenFlow implementations either. A possible explanation is that both are built on top of white-box switches whose ASIC pipelines are already built for general routing and switching purposes. Since rewriting IP addresses is not something that happens within routing and switching, it is not implemented in hardware, and therefore both OpenFlow and Azure SONiC are unable to use this feature.

Examining switches based on Broadcom and Mellanox ASICs, we can see that it is currently not possible to implement multicasting on the Mellanox and Arista switches we tested, because of the lack of support by the manufacturers. For Mellanox, we were able to confirm that multicasting is possible on l2, but Mellanox currently does not use these features in its SAI implementation. We think the SAI implementations of more manufacturers should be examined to see how much they implement when it comes to multicasting. Ideally, the source code of the implementation is open source, as with Mellanox. It would then be easy to determine the level of support by looking at the source code of the manufacturer's SAI implementation.

6.4 Practical implementation

As a workaround to get multicasting working, we created a special case within the syncd service that circumvents the SAI API by calling the Mellanox SwitchX SDK directly. To make this possible we created three Python scripts: the first sets up the multicast groups, the second adds ports to a multicast group and the third deletes members from a multicast group. These Python scripts are called from the syncd service, and the call to the SAI API is skipped. With this method, we were able to prove that support for multicasting could be added. To make multicasting fully possible with SONiC, three things would have to be added. The first is to modify the SAI API implementation to support the multicast FDB. With this support added, we also need to extend SWSS with an orchestrator that transforms multicast rules from the config database into the appropriate SAI calls. The last addition is a new service that listens on the management port of the switch. This service receives instructions on how to manipulate the multicast group and modifies the configuration rules within the config database.
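The workaround can be pictured as a dispatch step inside the service: multicast operations are routed to three handlers instead of the SAI API. The sketch below models only that control flow in memory. The class `MulticastWorkaround`, the operation names, the `dispatch` function and the port name are hypothetical, and neither the Mellanox SDK nor syncd code is called.

```python
from typing import Callable, Dict, List, Set


class MulticastWorkaround:
    """In-memory stand-in for the three scripts (hypothetical names)."""

    def __init__(self) -> None:
        self.groups: Dict[int, Set[str]] = {}
        self._next_id = 1

    def create_group(self) -> int:
        """First script: set up a multicast group."""
        group_id = self._next_id
        self._next_id += 1
        self.groups[group_id] = set()
        return group_id

    def add_member(self, group_id: int, port: str) -> None:
        """Second script: add a port to a multicast group."""
        self.groups[group_id].add(port)

    def delete_member(self, group_id: int, port: str) -> None:
        """Third script: delete a member from a multicast group."""
        self.groups[group_id].discard(port)


MULTICAST_OPS = {"create_group", "add_member", "delete_member"}


def dispatch(op: str, args: tuple, workaround: MulticastWorkaround,
             sai_call: Callable[[str, tuple], object]) -> object:
    """Send multicast operations to the workaround; everything else to SAI."""
    if op in MULTICAST_OPS:
        return getattr(workaround, op)(*args)  # the SAI call is skipped
    return sai_call(op, args)


sai_log: List[str] = []


def fake_sai(op: str, args: tuple) -> None:
    """Placeholder for the regular SAI path; it only records the operation."""
    sai_log.append(op)


workaround = MulticastWorkaround()
gid = dispatch("create_group", (), workaround, fake_sai)
dispatch("add_member", (gid, "Ethernet4"), workaround, fake_sai)
dispatch("create_acl_entry", (), workaround, fake_sai)
```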

6.5 Future work

We only tested the functionality of unicast and multicast stream delivery on two platforms. In the future it is worth investigating the SAI implementations on other platforms for the features we tested. If no manufacturer supports the functionality, it is also worth investigating whether it is possible to extend the SAI implementations of manufacturers that release them as open source.

To make the multicast stream delivery work on Azure SONiC three different components have to be added:

1. The first is a new daemon that listens on the management port to receive information on how to edit the multicast forwarding database and turns this information into configuration rules.

2. Then we need an orchestrator that turns the multicast rules into SAI calls and puts these into the ASIC database.

3. Finally, we need to extend syncd to support all of the features for the multicast forwarding database.
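The first two components in this list could fit together roughly as follows. This is a sketch under loud assumptions: the JSON instruction format, the `MCAST_FDB|...` key layout, the operation name `add_group_member` and both function names are invented for illustration and are not existing SONiC schema or SAI calls.

```python
import json
from typing import Dict, List, Tuple

ConfigRule = Tuple[str, Dict[str, str]]


def instruction_to_config_rule(raw: str) -> ConfigRule:
    """Daemon step: turn a management-port instruction into a config rule."""
    msg = json.loads(raw)
    # Hypothetical key layout, modelled on the ACL_RULE keys used earlier.
    key = f"MCAST_FDB|Vlan{msg['vlan']}|{msg['group_mac']}"
    return key, {"ports": ",".join(sorted(msg["ports"]))}


def config_rule_to_operations(rule: ConfigRule) -> List[Tuple[str, str, str]]:
    """Orchestrator step: expand one rule into one operation per member port."""
    key, fields = rule
    return [
        ("add_group_member", key, port)
        for port in fields["ports"].split(",")
        if port
    ]


raw = (
    '{"vlan": 200, "group_mac": "01:00:5e:01:01:01",'
    ' "ports": ["Ethernet8", "Ethernet4"]}'
)
rule = instruction_to_config_rule(raw)
operations = config_rule_to_operations(rule)
```

Keeping the daemon and the orchestrator as separate steps mirrors the list above: the daemon only writes configuration, and only the orchestrator decides which calls follow from it.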

We think this will work because we tested all of the individual components and it does seem like a practical implementation is possible. The hardest part would be to extend the SAI implementation of Mellanox, but this would be solved if a vendor was found which does fully support


CHAPTER 7

Conclusions

In this research, we looked at the SAI implementations of two different vendors and the functionality they both support for the stream delivery systems. To discover this, we asked ourselves the question: is a platform like Azure SONiC suitable to support a publish-subscribe system for the SKA use case? To answer this question we divided it into two subquestions. The first concerned the feasibility of a system using unicast data streaming and the second concerned multicast data streaming.

Unicast data streaming will not work in the current state of SAI. The critical part of this system is to rewrite the IP address. Unfortunately, we discovered that even though this critical feature was described in SAI, neither Mellanox nor Broadcom had implemented this feature in their SAI API. For this reason, we would have to say that a publish-subscribe system using unicast data streaming is not feasible to implement in SAI and SONiC.

Multicast data streaming also lacked support in the SAI adapter implementations we examined. Mellanox is currently the more promising platform for getting this feature working. The missing parts of the SAI API for multicasting would have to be added to the SAI implementation manually. Unfortunately, this means we lose the platform independence of SAI, and the SAI API is the responsibility of the manufacturer.

Having tested support for these two methods, we can conclude that support for a data stream delivery system in the SAI implementations we have tested is insufficient. For the two methods we examined we found no working implementation of the SAI API that could be the basis of a publish-subscribe system. However, we did prove that the basis of an l2 multicast implementation is available. We were able to work around the lack of support for multicasting by programming the SDK directly from SONiC. We think that an implementation could be created for multicasting by extending the SAI API implementation.


Bibliography

[1] P. C. Broekema, “Improving sensor network robustness and flexibility using software-defined networks,” SKA SDP Memo, 2015.

[2] “Signal transport and networks.” https://www.skatelescope.org/signal-processing/. Accessed: 08-06-2019.

[3] “Sonic: The networking switch software that powers the microsoft global cloud.” https://azure.microsoft.com/nl-nl/blog/sonic-the-networking-switch-software-that-powers-the-microsoft-global-cloud/. Accessed: 03-04-2019.

[4] “Switch Abstraction Interface (SAI) officially accepted by the Open Compute Project (OCP).” https://azure.microsoft.com/nl-nl/blog/switch-abstraction-interface-sai-officially-accepted-by-the-open-compute-project-ocp/. Accessed: 08-06-2019.

[5] “SKA Project.” https://netherlands.skatelescope.org/ska-project/. Accessed: 08-06-2019.

[6] “SKA Australia.” https://www.skatelescope.org/australia/. Accessed: 08-06-2019.

[7] “SKA Africa.” https://www.skatelescope.org/africa/. Accessed: 08-06-2019.

[8] P. C. Broekema, R. V. van Nieuwpoort, and H. E. Bal, “The square kilometre array science data processor. preliminary compute platform design,” Journal of Instrumentation, vol. 10, no. 07, p. C07004, 2015.

[9] D. Twelker, “On the feasibility of software-defined networking in the Square Kilometre Array science data processor,” UvA Scripties Online, 2016.

[10] “Overview of the internet multicast routing architecture.” https://tools.ietf.org/html/rfc5110. Accessed: 17-06-2019.

[11] J. F. Kurose and K. W. Ross, Computer Networking: A Top-Down Approach. Pearson, 6th ed.

[12] “Ethernet multicast MAC addresses.” http://h22208.www2.hpe.com/eginfolib/networking/docs/switches/5130ei/5200-3944_ip-multi_cg/content/483573739.htm. Accessed: 08-06-2019.

[13] P. C. Broekema, D. R. Twelker, D. C. Romão, P. Grosso, R. V. van Nieuwpoort, and H. E. Bal, “Software-defined networks in large-scale radio telescopes,” in Proceedings of the Computing Frontiers Conference, pp. 263–266, ACM, 2017.

[14] L. Makowski and P. Grosso, “White-label open-source networking.” https://wiki.surfnet.nl/display/SURFnetnetwerkWiki/Project%3A+White+Label+Switches?preview=/11211092/11211318/UvA-white-label-switching-RoN-2017.pdf. Accessed: 27-05-2019.


[15] “Vxlan routing.” https://docs.cumulusnetworks.com/display/DOCS/VXLAN+Routing. Accessed: 09-06-2019.

[16] “Understanding SPAN, RSPAN, and ERSPAN.” https://community.cisco.com/t5/networking-documents/understanding-span-rspan-and-erspan/ta-p/3144951. Accessed: 09-06-2019.
