Partitioning of Big Graphs

(1)

Bachelor Informatica

Partitioning of Big Graphs

Youp Uylings

August 4, 2015

Supervisor(s): Merijn Verstraaten

Inf

orma

tica

—

Universiteit

v

an

Ams

terd

am

(2)

(3)

Abstract

Graphs are a versatile data structures, and as such are used in quite a variety of fields like net-work analysis, netnet-work science and graph theory. Over the last decade, the size of graphs used by big companies such as Google, Facebook and Amazone, has increased continuously. Many graphs are now too big to fit into memory. To be able to load the graph into memory and use the graph efficiently, the graph needs to be partitioned. However, partitioning large graphs is computationally intensive. This is why faster partitioning algorithms are required.

Most of the common partitioning methods used (on all types of graphs) are quite complex. The literature in the field of graph partitioning is only occupied with studying optimal workload distributions without considering the computation cost of partitioning [9].

This thesis focuses on five partitioning algorithms and measures both the cross partition com-munication and the workload of five different common types of large graphs, partitioned in two, four and six partitions. The five partitioning algorithms contain three complex partitioning algo-rithms (METIS(recursive), METIS K-Way and Spectral partitioning) and two, relatively simple algorithms (Node- and Edge partitioning). We measured the cross partition communication on Pagerank and Breadth First Search. In combination with the workload balance, this yields a good measurement of the efficiency of the partitioning algorithms.

Research questions that arise naturally:

• Can a sound procedure be developed to compare the efficiency of the five partitioning algorithms on various types of graphs?

• Does the difference in efficiency justify the use of less complex algorithms?

To be able to make a quick overview of the results, we made a small classification scheme that consists out of three classes.

• 1. Edge Partitioning is most efficient

• 2. Most efficient algorithm is open for speculation • 3. METIS Partitioning is most efficient

METIS is the most efficient in partitioning undirected graphs, while the less complex Edge partitioning is in most cases more efficient in partitioning directed graphs and graphs where the degree of every node is more or less in line with the average degree.

In order to ensure that the selected partitioning algorithm is the most suitable for a particular graph, we have created an overarching formula that is able to compare the efficiency of all the partitioning algorithms. When the most efficient algorithm is open for speculation, this formula is the perfect answer to be able to choose the most efficient partitioning algorithm.

(4)

(5)

Introduction

Graphs are an efficient way to represent complex data structures and contain all kinds of infor-mation. Social Networks are an illustrative example: the nodes represent the people and the connections between the people are represented by edges between the vertices. By analyzing its structure, it is possible to extract information from the graph, as a form of data-mining. Such operations lie within the field of network analysis. Figure 1.1 shows a small example of a visualized graph. The nodes represent the people and the edges the relationship between the people. The node becomes larger if the node is within the shortest path to connect people.

Figure 1.1: Social network graph[19]

When the graph approaches or even exceeds memory limits, it becomes harder and possibly prohibitively expensive to process if not partitioned. The way to proceed is to partition the graph such that the partitions can be loaded into RAM/GPU.

When partitioning a graph, the nodes and the edges must be evenly divided in order to have a good workload balance. This is required to maintain maximal efficiency when using distributed computing. Most prior work in this area was done by mathematicians that primarily looked at achieving this balance by evenly distributing the workload among the partitions [9]. The partitioning algorithms used to achieve this goal are often complex. The use of such algorithms results in computationally intense processing of the graph which takes a substantial amount of time. Less complex algorithms can be used to reduce this time, at the cost of a less balanced workload of the partitions.

(8)

There are two main efficiency indicators for the resulting partitions, the equal distribution of workload and the cross partition communication. If the workload is balanced and the cross partition communication is low, then the partitioning of the graph is considered efficient. Cross partition communication(CPC) is the communication between nodes that reside in different partitions. For example, when a node is to communicate with a neighboring node in another partition, this partition must be loaded into GPU memory or communicate with different ma-chines within a cluster in order to get the required information. This is a time consuming process on these types of modern day architectures, thereby making CPC a heavier weighing factor in creating efficient partitions.

The focus in the present thesis lies on partitioning graphs in two, four and six parts. Two part partitioning is chosen because this is the only way to compare K-way partitioning (a par-titioning method that makes K partitions) with a Bi-parpar-titioning method (which always creates two partitions). The partitioning of the graph in four and six parts is used to illustrate the difference in efficiency if the partitioning method increases the amount of wanted partitions.

We will assume that the most efficient way to partition all types of graphs is not limited to a single partitioning algorithm. Complex partitioning algorithms will create a better workload balance and less complex partitioning algorithms are computationally faster. This thesis will focus on the importance of complex partition algorithms versus relatively simple partitioning methods on big graphs. Five big graphs with varying attributes are used to examine this. These graphs are partitioned by five partitioning algorithms:

• METIS

• METIS K-Way • Spectral partitioning • Node partitioning • Edge partitioning

Most relevant literature only considers optimal workload distributions in partitioning graphs [9]. This project, however, uses a different angle. The primary questions when studying the above five schemes are:

• Is there a good way to compare the efficiency of the five partitioning algorithms? • Does the difference in efficiency justify the use of less complex algorithms?

The partitioning algorithms will most likely yield different results depending on attributes of the input graph.

(9)

Bachelor Project Partitioning of Big Graphs

Approach

In order answer the research questions, a test environment must be developed. This test envi-ronment contains both the five different partitioning methods discussed in the previous section, Pagerank and Breadth First Search. The test environment is discussed in detail in the test en-vironment section.

After the partitions are created, the distribution of nodes and edges within the partitions and between the partitions are calculated. This offers insight in how the partition proportions deviate from each other. These partitions are then used in two different validation algorithms, Pagerank and Breadth First Search, to see the difference in communication between the partitions.

This project will focus on partitioning using the partition values in the nodes of the full graph. Partition algorithms and distributive algorithms will initially be made for smaller graphs. Then bigger and different type of graphs with different attributes will be used to test whether the code is still producing accurate results.

(10)

(11)

CHAPTER 2

Network Theory

Graphs hold all kinds of information. Some examples of the use of graphs are the representation of social networks, electrical networks or road maps. Lines and points are used to visually represent a graph. In a social network the people are represented by points and the connections/relations between those people by the lines. In graph theory the points and lines are usually called nodes (or vertices) and edges. It can occur that edges have a direction, this means the edge is only going in the direction the arrow is pointing at. A graph like this is called a directed graph, or more commonly a digraph. A graph with no direction in the edges is called a undirected graph. There are two types of undirected graphs, simple graphs and multigraphs. Simple graphs do not contain loops, an edge from the same node to the same node, or multiple edges from one node to another. Multigraphs contain multiple edges to other vertices. In this thesis only simple graphs will be examined. These are the most common type of graphs and a selection must be made because of the limited time frame. An example of a network with nodes and edges in the form of a undirected and directed graph can be seen below: (Figure 2.1).

Figure 2.1: Example network of an undirected and a directed graph [17]

Notice that these graphs are not identical. The directed graph has less traversal options than the undirected graph. This is due to the fact that the directions in the edges limit the traversal possibilities. An undirected graph can be easily transformed into a directed graph. This is achieved by adding edges in both directions. In graph theory an edge in a digraph is called an arc. The edge notation is not interchangeable like in undirected graphs: 1,2 is not the same as 2,1, while the notation in the undirected graph is manifestly written as (1.2). A weight can be assigned to edges of graph, e.g. when a network represents the map of a road with the weights of the edges equal to the length of the road expressed in kilometers. This adds information to

(12)

the graph that can be used for future computational purposes, such as finding the fastest route from A to B. A graph with weights is called a weighted (di-)graph. An unweighted graph can be transformed into a weighted graph by giving all edges/arcs a weight of one. A directed graph can also be a weighted graph (Figure 2.2).

Figure 2.2: Weighted digraph [7]

We might want to use a graph like this for computational purposes. When going from one node to another, a certain route must be followed. For example when looking at Figure 2.2, the route from P,U can be (P,S)(S,U, (P,Q),(Q,S),(S,U) or (P,Q),(Q,R),(R,U). Such a route is called a path. A common metric to use in graph theory is the shortest path. This is the minimal path between two nodes. It can be expressed as min(P

iedgeweight(i)) with i being the collection of

edges one must traverse to reach the destination node. In (Figure 2.2) all the nodes seem to be connected to each other. However, when taking multiple starting points in consideration, not all nodes can be reached. For example when taking P as a starting point, all nodes Q,S,T,U,R can be reached. However when we take S as an starting point only T,U can be reached. Such a graph is not connected or disconnected. In a connected graph, there must exist a path between any chosen two nodes in the entire collection of nodes.

All graphs contain subgraphs. A subgraph of a graph X is a graph whose collection of vertices is a subset of the collection of vertices of X and whose collection of edges is a subset of the collection or edges of X.

Figure 2.3: Subgraph [1]

In (Figure 2.3), G2 is the subgraph of G1. Which means that G1 is the supergraph of G2.

An important property of nodes in a graph is their degree. The degree of a node is the number of edges that is connected to the node. In a digraph we usually speak of in- and out degree. The in degree is number of edges that are pointing towards the node. The out degree is the number of edges pointing outwards to other nodes in the graph. When a node has degree zero, the node is called an isolated node. If two edges, edge1 and edge2, have a common node A, the edges are called incident. When looking at a node, the degree has a direct link to the

(13)

number of incident edges (to that node). If for two nodes A and B there is an edge e joining them, we say that A and B are adjacent. A graph can be described in an adjacency matrix. This is a matrix in which the connectivity of the all nodes in the graph is described. An adjacency matrix A of a graph G with N nodes is a matrix with the size of N × N ; the column size and the row size equals the nodes in the graph. The position within the matrix Aij represents the

connectivity of node j with node i. If there is an edge between node j and node i then the value in matrix position Aij will be one, in all other cases it will be zero. This can be described as such:

Aij =

(

1 if there is an edge from node j to node i 0 other cases

In a undirected graph, the value on the position in the adjacency matrix of Aij is the same as

the value on the position Aji. In a digraph these values can be different as the value will only be

one is there exist an edge from j to i. In loop and multi graphs adjacency matrices can contain other values. As this thesis will not consider these graphs, we will stick with this definition of the adjacency matrix. This is a simple example of an adjacency matrix (Figure 2.4).

Figure 2.4: Adjacency Matrix [2]

A different way to represent the connectivity of a graph is with an incident matrix. The incident matrix A is a N × M matrix, where N is the number of vertices and M the number of edges. The position within the matrix Aij represents the connectivity of node i the edge j. If

there is a connection between node i and edge j, which means they are incident, then the value in matrix position Aijwill be one. In all other cases it will be zero. This can be described as such:

Aij =

(

1 if there is an edge i going towards node j 0 other cases

The degree matrix is a matrix that can be constructed from the adjacency matrix. This can be done by constructing a degree vector of 1 × N . This vector contains at element i the sum of the row of the column number j of the adjacency matrix. The degree matrix is then constructed by multiplying the degree vector with the identity matrix of size N × N . With the degree matrix D and the adjacency matrix A, the Laplacian matrix L can be constructed. This can be done by subtracting A from D.

(14)

The Laplacian matrix can be defined by the following set of rules (Figure 2.5) Aij =      k(i) if i=j −1, if ∃ e(i,j) 0 otherwise

Figure 2.5: Laplacian Matrix

The Laplacian matrix has the entry −1 where the adjacency matrix has a non-zero value. All the degrees are represented on the diagonal. All other values are zero.

(15)

CHAPTER 3

Partitioning Algorithms

There are a lot of partitioning methods that can be used to partition graphs. In this thesis five of them are discussed thoroughly. Every partitioning method has its pros and cons. Complex partitioning algorithms take longer to compute the partitions. However, they create partitions which are often more balanced and/or minimize the cross partition communication better then less complex partitioning algorithms. In general, the goal is to reduce the computational time of a process. Consequently, in some cases a less complex algorithm would be preferable over a more complex one.

Consider an example to elaborate this point a little further. The time to compute a large, well partitioned graph with a good workload distribution and minimal edge connections could be quite high, say 3 hours. Estimating 30 minutes time to compute results with the use of these partitions, this adds up to a total of 3.5 hours. If, on the other hand, we use a less refined partitioning method, the partitioning process itself might take an hour and the subsequent time to compute results 1.5 hours with a total time of 2.5 hours. In this case, it would be profitable to use the less refined partitioning method. Yet, if we want to do multiple computations using the partitioned graphs, a more refined method would be better. A widely used approach to partition big graphs is the complex METIS algorithm.

3.1 METIS

METIS is a partitioning method used in various branches of science and in business [3]. This partitioning algorithm focuses on minimizing the number of cross partition edges and distributing the workload evenly among the partitions. However, this partitioning method takes longer then partitioning with node or edge partitioning. These partitioning methods will be explained in the next sections. The issue is generally considered to be a NP-hard problem [10]. With METIS, graphs are partitioned in three phases. The first phase is the coarsening phase, the second the partitioning phase and the third and final phase the uncoarsening phase (Figure 3.1). The partitioning phase contains both bi-partitioning and K-way partitioning. Unlike K-way partitioning, bi-partitioning is done recursively. The subsections below contain a more extensive explanation of these various phases.

(16)

Figure 3.1: METIS Phases [11]

3.1.1 Coarsening Phase

The main goal of the coarsening phase within the METIS algorithm is to reduce the original large graph to a smaller graph, and still contain enough information to make a good partition. This happens with the use of a matching algorithm. Several matching algorithms are used for different types of graphs. A matching algorithm collapses vertices that are incident on each edge of the matching v, u ∈ V1 → w ∈ V2 [8]. The resulting weight of the vertices is the sum of the

collapsed vertices. Only two incident vertices may be collapsed at a time. If vertices are not incident, the vertices will be copied to the newly coarsened graph. This method is used until the desired size is achieved or the desired number of iterations is performed. The graph is usually coarsened down to a few hundred vertices. There are several matching algorithms, the most prominent ones being heavy matching (Figure 3.2) and random matching.

Figure 3.2: Heavy Edge Matching [10]

The heavy matching (HEM) algorithm initially matches the heaviest incident weighing edges. This has a positive effect on the number of graphs that need to be created before the final graph can be partitioned, which in turn has a positive effect on the time of the uncoarsening phase. However, the heaviest weighing edges need to be found first.

The random matching algorithm randomly selects incident vertices and matches them. This matches vertices faster then the heavy matching algorithm. However, it potentially needs to create more coarsened graphs before the final graph can be partitioned. Literature states that

(17)

HEM is the most efficient algorithm to use when coarsening graphs [8]. This matching algorithm will be used from hereon in this thesis.

3.1.2 Partitioning phase METIS

The second phase in the METIS algorithm consists of partitioning the coarsened graph. METIS has four different build-in partitioning algorithms: spectral bisection, Kernighan-Lin, graph grow-ing and greedy graph growgrow-ing. The goal is to compute a minimum edge cut bisection. This implies dividing the graph in two balanced parts while minimizing the number of edges to be cut. Actually, this is done almost instantaneously because of the small size of the coarsened graph. This graph reflects all the weights of the edges that are collapsed in the right places. Therefore it contains enough information to be able to compute an accurate bi-partition. Spec-tral partitioning will be thoroughly discussed in the SpecSpec-tral Graph Partitioning section below.

The Kernighan-Lin algorithm (KL) has a O(N2_{log N ) complexity [5]. This frequently used}

partitioning and refinement algorithm was actually one of the first algorithms to be created. It can be seen like spectral partitioning a bi-partitioning algorithm. The Kernighan-Lin algorithm creates two clusters of a predefined size or -in this case- operates on two previously created clus-ters. The KL algorithm keeps track of the number of edges communicating within the clusters and the number of edges communicating outside the cluster. The ratio of these statistics is the gain of partitions:

gain = (communication within the partition)_{(cross parititon communication)}

The KL algorithm keeps swapping subsets of vertices in order to maximize the communication in each cluster. This process can be seen in (Figure 3.3)

Figure 3.3: Kernighan-Lin [18]

The third partitioning algorithm is the Graph Growing Partitioning algorithm (GGP). This algorithm starts with a node and adds the region around it. This is done until the set size of the partition is reached. It adds vertices in a Breadth First Search manner (BFS): this BFS algorithm will be explained in the Experimental validation chapter. GGP has a high performance variation depending on the initial node. This is because different initial vertices yield different edge cuts. However when combining this partition algorithm with further refinement in the uncoarsening phase, the partitions will become significantly more balanced, as can be seen from the fact that the gain increases. The final partitioning algorithm implemented in METIS is an addition to the previous partitioning algorithm, the Greedy Graph Growing Partitioning algorithm (GGGP). This partitioning algorithm has a greedy nature. Like the Kernighan-Lin algorithm, the GGGP keeps track of the gain as it adds the vertices with the highest computed gain. This gain in turn is added the frontier, the growing region. The graph cut of this algorithm is better than that of the GGP. The downside is the increase of the partitioning time as the gain needs to be computed. The last three partitioning algorithms are all based on graph growing heuristics. As the coarsened graph is of such a small size, the partition algorithms all produce comparable partitions [9]. Some partitions are better then others, like the GGGP compared to the GGP. METIS will choose the best partitioning method based on the graphs attributes. However, this can be compensated in the uncoarsening phase by the refinement algorithm. This

(18)

will be discussed in the next subsection. METIS K-way partitioning uses the same partitioning algorithms, but recursively until it reaches K number of partitions. Another approach of METIS K-way partitioning is to coarsen down to K edges. However, this is likely to make the partitioning unbalanced, as the weights of the edges may be very different. The second problem is that it is very expensive [9].

3.1.3 Uncoarsening Phase

In the uncoarsening phase, METIS unpacks the condensed graph created by the matching algo-rithms. This is done in various phases. In Figure 3.1, the matching algorithm is applied four times before the graph is small enough for transition into the partitioning phase. The states G0 − GN , in which N is the number of coarsened graphs, are stored within memory. This is needed in order to unpack the graph, which is done in the following manner: GN will uncoarsened down to the initial graph G0, while remaining the split in partitions. This is done recursively. GN − 1 will be compared to the corresponding coarsening graph GN − 1 while maintaining the partition split. After unpacking, a refinement algorithm will be applied in order to ensure a good partition balance. In the uncoarsening phase, METIS uses a refinement algorithm from the Kernighan-Lin (KL) algorithm [5]. The Kernighan-Lin algorithm keeps swapping subsets of vertices in order to maximize the communication within each cluster. This method will mini-mize the cross partition communication. This is done in a relatively small amount of time, it substantially decreases the edge cut with a small number of iterations [9]. When focusing on maximizing communication within the clusters, the resulting partitions contain a high average degree. If there is no refinement method applied in the uncoarsening phase, the uncoarsening phase only has to compare the GN and the G0 graph. This might speed up this phase, however the resulting partition might become highly unbalanced as a result.

3.2 Spectral Graph Partitioning

Spectral graph partitioning is a linear algebra based partitioning method. Spectral methods for graph partitioning are however computationally expensive algorithms. This partitioning algo-rithm was implemented using the networkx and numpy libraries of python2.7. It computes the Laplacian matrix L of the large input graph. This can be obtained as mentioned in the above (Figure 2.5). Next, the eigenvalues and the eigenvectors of this matrix are obtained by straight-forward linear algebra. The eigenvalues (and their corresponding eigenvectors) are then ordered as a function of magnitude: λ0= 0 ≤ λ1≤ λn−1. The eigenvector corresponding to the smallest

eigenvalue consists exclusively of ones. This vector is of no practical use as no useful information can be retrieved from it. The second smallest eigenvector, however, contains the information needed to make the spectral partition. This vector is also known as the Fiedler vector [4].

v−T · L(G) · ~v = X i,j∈E (Xi− Xj)2 λ1= min ~ v⊥(1,...,1) _v−T_·L(G)·~_v v−T_·~_v

When the corresponding eigenvalue of the Fiedler vector of the Laplacian matrix is com-puted, one can see whether the graph is connected or not. This number is also referred to as the algebraic connectivity [6]. This is useful as spectral partitioning can only be effective if the graph is connected. This is because spectral partitioning focuses on minimizing the edge cut. In an disconnected graph this would be zero. Once the Fiedler vector is found, the mean value is computed:

Pivot = _{(Number of components)}P (Fiedler components)

With this pivot, the nodes that correspond with the eigenvalues can be split into two parts. All the nodes with the corresponding eigenvalues below the pivot are partitioned in partition A

(19)

and all other nodes will be put in partition B. With this method two partitions are created, a procedure called bi - partitioning.

The above algorithm can be called recursively to create a predetermined number of partitions. The partition sizes are computed such as to minimize the cross partition communication. This indicates that the number of partitions is limited to an even number. The reason behind this is that the recursive call will be bound by a threshold set by the partition size. If the partition size is above the threshold, it would automatically invoke the spectral partitioning on that partition. However, spectral partitioning is not perfect and thus does not create perfectly workload balanced partitions. It is possible to create an odd number of partitions this way. In theory spectral graph partitioning can not be effectively applied to directed graphs and disconnected graphs, as this will lead to severely unbalanced partitions. For purposes of comparison, the performance of spectral partitioning will also be calculated on graphs that are disconnected and/or directed.

3.3 Node partitioning

Node partitioning is implemented in this thesis mainly for purposes of comparison. It is the most basic kind of partitioning that most likely creates highly unbalanced partitions. The algorithm computes the partition sizes by dividing the number of nodes in the graph by the number of required partitions. Next, it loops through the graph, systematically adding nodes to the partitions. Once the partition is full, the next partition will be filled without considering any graph attributes, gain, vertices or other types of information. This can be useful if the input graph has a non weighted balanced degree distribution. Only then will the partitions be well balanced. Node partitioning is most likely the fastest partitioning algorithm there is, with a complexity of O(N ).

3.4 Edge partitioning

Edge partitioning is still a widely used partitioning algorithm [13]. It is for this reason, that the algorithm is selected for implementation. Similar to node partitioning, it is a very fast partitioning algorithm, but now edges are the deciding factor. Like in node partitioning the partition size is calculated by the number of partitions the user wants. This is done by dividing the total number of edges with the number of partitions. Every time a node is added to the partition, the degree number of incoming edges is added to the total degree. If the total degree is equal to or higher then the partition size, a new partition is made, in which the remaining nodes are then added in the same structural manner. This partitioning method requires a different partitioning method for directed graphs then for undirected graphs. In undirected graphs, the in-degree is not given, as this is the degree is the node. Therefor the edge must be removed after the degree is added to the total degree, to ensure that the same node is not added twice.

(20)

(21)

CHAPTER 4

Experimental Validation

4.1 Algorithms

To be able to benchmark the different types of partitioning algorithms discussed in the previous chapter, we implement two algorithms, Pagerank and Breadth First Search. These algorithms take the partitioned graph and log the cross communication for that algorithm. Knowing the difference in cross partition communication between the partitioning algorithms is crucial to be able to say anything about the efficiency of the partitioning algorithm. This is needed in order to be able to reevaluate the importance of complex partitioning algorithms such as METIS and spectral partitioning.

4.1.1 Pagerank

Pagerank is a ranking algorithm that is used by Google to rank web pages [12]. Pagerank creates a web page ranking by evaluating the rank of all the incoming links to that page. If a web page had many incoming links, hyperlinks linking towards the page, the webpage will be ranked higher. The outgoing links do not have any effect, and yield no indication of any importance. This algorithm can also be applied to graphs. Pagerank is a good way to effectively capture the relative importance among nodes of a graph, and is also applicable with weighted edges. For example, if there is a link from a big company to your website, the ranking will increase more then when there is a link from a small website. The weight of the link depends on the Pagerank of the node:

Pagerank website =P (Pagerank incoming edges) (Number of links to the page)

This can be mathematically formulated as :

PR(W) = (1 − d) + d ·P(PR(V))

(N(V))

If a ranking changes, for example when a node is added or removed from the graph, the algorithm will reevaluate every ranking in the graph. The new rankings are stored in a buffer. When the Pagerank algorithm is finished computing all the required values for the nodes, the buffer values will overwrite the current ranking values. This is needed in order to ensure the consistency of the graph ranking values. In fact, if the value of a node A would be updated prematurely, a neighbouring node B could get the wrong ranking value of the node A. Pagerank chooses a starting node from which to propagate through the entire graph. In theory, all the neighbouring nodes of the starting point would have to calculate the ranking at the same time. However this is done one at a time as it does not concern a multi-threaded implementation.

(22)

4.1.2 Breadth First Search

The second algorithm is Breadth First Search (BFS). BFS is a different kind of algorithm: as will be discussed later on in more detail, it has less iterations then Pagerank and goes through the graph in a different manner. It will therefore likely lead to a different partition communica-tion then Pagerank. For analytic purposes a secondary algorithm that requires communicacommunica-tion between nodes is useful. The reason for this is that different types of graphs behave differently with these algorithms. The results will be discussed in the below Result chapter. Breadth First Search constructs a structural path though the graph. It starts at the initial node and then traverses through the graph layer by layer. The layer representing the level of depth the node is currently one, which is also the minimal number of steps removed from the initial node (the shortest path).

Figure 4.1: Breadth First Search [14]

As can be seen in Figure 4.1, it starts at the initial node or parent. Subsequently, the children of the initial node are visited. Next, the children become the parents and their children will be visited in a structural manner. If a child has already been visited, it will not be visited again.

(23)

CHAPTER 5

Results

We ran the Pagerank and the BFS algorithms on the various partitioned graphs. The resulting cross partition communication between partitions was logged in an external file. This is done with the five different types of graphs and the five partitioning algorithms mentioned earlier. This file is then parsed with a python parser that uses regular expressions. The five big graphs used from the Stanford SNAP database are given below:[16].

Graph Number of Nodes Number of Edges ego-Facebook 4039 88234

wiki-Vote 7115 103689 ca-HepPh 12008 118521 p2p-Gnutella31 62586 147892 loc-Brightkite 58228 428156

Table 5.1: Graphs and their Sizes

All the graphs have different attributes, they are selected from five different categories of the Stanford SNAP database. The underlying reason is to be able to give a good overview of the effect of all the partition algorithms on the five different types of graphs. Then we are able to assess the efficiency of each particular partitioning algorithm in relation to the type of graph it is used on.

Graph Diameter 90-percentile effective diameter Type ACC 1 ego-Facebook 8 4.7 Undirected 0.6055

wiki-Vote 7 3.8 Directed 0.1409

ca-HepPh 13 5.8 Undirected 0.6115

p2p-Gnutella31 11 6.7 Directed 0.0055 loc-Brightkite 16 6 Undirected 0.1723

Table 5.2: Graph Attributes

This table gives a good indication of the graph connectivity. This is given by the diameter, the 90-percentile effective diameter and the ACC. Also stated are the type of graph and whether the graph is fully connected or not. The diameter is also known as the longest shortest path. This is the smallest maximum number of steps that a node is located from another node. The 90-percentile effective diameter is the average distance between a node and every other node in 90% of the cases. The last attribute is the Average Cluster Coefficient. A cluster coefficient is a quan-tity that indicates the degree of clustering within a graph. The cluster coefficient is calculated as shown in Figure 5.1. The average clustering coefficient is the normalized sum of all the clus-ter coefficients. This number gives an indication of the entire degree of clusclus-tering within a graph.

(24)

Clustering coefficient = The number of connections between the neighbors_{Out Degree}

Figure 5.1: Cluster Coefficient [15]

Efficiency can not be derived from just looking at the cross partition communication (CPC), as the workload balance of the partitions must also be taken into account. The combined work-load balance and cross partition communication is a good indicator for efficiency/performance. The load balance is based on the even distribution of the nodes over the partitions. For the sake of comparison, we are first looking at Bi-partitioning, whereupon the number of partitions grad-ually is increased by two. With these results we can compare the differences in workload balance and CPC in the five graphs with different number of partitions. This is needed to accurately estimate the efficiency of each of the partitioning methods on a particular graph.

The workload balance of the partition algorithm is extracted by analyzing the partitions. The number of nodes in each partition is calculated by looping through the partition list and sorting the partitions in a list corresponding to their partition number. The length of each of the partition lists is the number of nodes in their respective partition.

The number of edges in each partition is extracted by going through the list of edges. The partitions of the nodes of each of the edges are then compared with each other. If the nodes are equal, the partition in which the nodes reside will have an partition edge. The inter-partition edge will be counted separately for every inter-partition. If the nodes are different, there is a cross partition edge, which means that the number of intra-partition edges will be incremented.

The most important aspects to look at in the following result tables are primarily the balance of edges between the partitions, and secondly the number Cross Partition Edges (CPE). High cross partition communication can be a huge bottleneck in performance. Especially when the computer architecture requires partitions to be loaded in to and out of memory in order to communicate between partitions. The balance between nodes across the partitions has less effect on the workload, as the most important part is the communication between the nodes.

The results can be seen in the tables (Figure 5.3, 5.4, 5.5, 5.6, 5.7).

ego-Facebook Partition 1 Partition 2 Partition 1 and 2

Partition algorithm Nodes - Edges Nodes - Edges Cross Partition Edges (CPE) Node Partitioning 2019 - 40033 2020 - 40301 7900

Edge Partitioning 1984 - 39672 2055 - 40390 8172 Spectral Partitioning 3285 - 80849 754 - 40 7345 METIS Partitioning 2019 - 32152 2020 - 27984 28098 METIS K-Way Partitioning 1960 - 27920 2079 - 38687 21627

(25)

ca-HepPh Partition 1 Partition 2 Partition 1 and 2 Partition algorithm Nodes - Edges Nodes - Edges CPE

Node Partitioning 6004 - 195049 6004 - 23405 18556 Edge Partitioning 863 - 83036 11145 - 118464 35510 Spectral Partitioning 358 - 13328 11650 - 223476 206 METIS Partitioning 6004 - 57337 6004 - 61435 118238 METIS K-Way Partitioning 6180 - 64433 5828 - 54595 117982

Table 5.4: 2 partitions Workload Balance ca HepPh Graph

wiki-Vote Partition 1 Partition 2 Partition 1 and 2 Partition algorithm Nodes - Edges Nodes - Edges CPE

Table 5.5: 2 partitions Workload Balance Wiki Vote Graph

p2p-Gnutella31 Partition 1 Partition 2 Partition 1 and 2 Partition algorithm Nodes - Edges Nodes - Edges CPE

Table 5.6: 2 partitions Workload Balance p2p Gnutella31 Graph

loc-Brightkite Partition 1 Partition 2 Partition 1 and 2 Partition algorithm Nodes - Edges Nodes - Edges CPE

Table 5.7: 2 partitions Workload Balance LOC Brightkite Graph

From all the 2 partitions workload distribution tables of the various partition algorithms, we can concluded that Spectral partitioning has by far the worst performance. This is quite un-derstandable. Spectral partitioning focuses on minimal communication between partitions and less on workload balance. It assigns a heavy weight to cross partition edges. Therefore Spectral partitioning has substantially less cross partition edges then the other partitioning algorithms. On the other hand, the workload balance is substantially worse then in the other partitioning algorithms. This makes the algorithm less useful in partitioning big graphs. The recursive form of METIS, METIS Bi-partitioning, has generally the best performance for 2 partition workloads distribution, although Node and Edge partitioning are even better for the ego-facebook graph. Looking at the workload balance in the ca-HepPh and p2p-Gnutella31 graphs, Edge partition-ing might even be a better alternative to METIS, considerpartition-ing the lower CPE and the lower run time of the partitioning algorithm. The partitioning algorithm that is most efficient can be calculated by our own efficiency formula that will be explained in the end of this section (figure 5)

(26)

When all the nodes in the graph have the same number of incoming and outgoing edges, Node and Edge partitioning would be perfect. If this is not the case and the graph contains higher degree nodes, then these nodes need to be divided equally among the number of par-titions in order to ensure a good workload balance. The workload in the ego-facebook graph is almost perfectly divided by Node and Edge partitioning, which is logical because the graph meets at least one of the previously mentioned cases. It is very likely that Node and Edge partitioned graphs will become highly unbalanced if the graph contains several nodes with a de-gree substantially higher or lower than average. If the dede-gree is substantially higher, the node is called a supernode. When looking at the Node and Edge partitioning at the wiki vote graph, is it obvious that there are one or several supernodes present, making the partition highly unbalanced.

In most cases, workload balance is not the only criterium. As mentioned before, a high cross partition communication can be a huge bottleneck in performance. The CPC is measured by running the Pagerank and Breadth First Search algorithms on all five Bi-partitioned five graphs. This CPC gives a better indication then the CPC calculated from the workload balance, because these are actual real life scenarios.

The cross partition communication of all graphs and all partitioning algorithms can be seen in the following bar plots Figure 5.2, 5.3:

(27)

Figure 5.3: 2 Partition Breadth First Search

As shown in the two figures, the CPC of METIS and METIS K-Way is higher then the other partitioning methods. This is because METIS mainly focuses on workload distribution. The other reason is that Node/Edge partitioning do not randomly take nodes/edges from the graph to fill the set number partitions until they are full but take the neighbours of the nodes and edges.

Cross partition communication of Spectral partitioning is noticeably very low, with the ex-ception of the ego-Facebook graph. This would be big improvement if one would not take the workload balance in account. However, the workload balance is very important when par-titioning big graphs, making Spectral parpar-titioning a not viable option under these circumstances.

The process of calculating the workload and the CPC of the five graphs will be repeated for four partitions. As Spectral partitioning is a Bi-partitioning algorithm and —as concluded earlier— not a viable option in partitioning big graphs, the algorithm will not be used in the K-way partitioning of the graphs. The results of the workload balance of the graphs partitioned in 4 parts can be seen in the tables Figure 5.8, 5.9, 5.10, 5.11, 5.12:

ego-Facebook Partition 1 Partition 2 Partition 3 Partition 4 CPE Algorithm Nodes - Edges Nodes - Edges Nodes - Edges Nodes - Edges Edges Node 1009-12845 1010-12449 1010-16348 1010-17527 29065 Edge 754-6527 895-15170 1294-18873 1096-21970 25694 METIS 1009-17804 1010-8312 1010-12154 1010-11212 38752 METIS K 1030-10900 1031-14063 980-6727 998-12438 44106

Table 5.8: Workload Balance 4 partitions ego-Facebook Graph

wiki-Vote Partition 1 Partition 2 Partition 3 Partition 4 CPE Algorithm Nodes - Edges Nodes - Edges Nodes - Edges Nodes - Edges Edges Node 1779-50799 1779-2384 1779-164 1778-71 50271 Edge 589-10225 383-4883 447-4820 5696-15649 68112 METIS 1779-41661 1779-3985 1779-569 1778-445 57029 METIS K 1726-38830 1831-4793 1726-738 1832-242 59086

(28)

p2p-Gnutella31 Partition 1 Partition 2 Partition 3 Partition 4 CPE Algorithm Nodes - Edges Nodes - Edges Nodes - Edges Nodes - Edges Edges Node 15646-27416 15646-15565 15646-16158 15648-19135 69618 Edge 9396-15496 13960-14017 16610-17475 22620-19758 81146 METIS 15646-14058 15647-12373 15646-9557 15647-10090 101814 METIS K 15998-10957 15999-10263 15284-10931 15305-14138 101603

Table 5.10: Workload Balance 4 partitions p2p-Gnutella31 Graph

loc-Brightkite Partition 1 Partition 2 Partition 3 Partition 4 CPE Algorithm Nodes - Edges Nodes - Edges Nodes - Edges Nodes - Edges Edges Node 14557-214952 14557-25660 14557-8540 14557-10254 168750 Edge 1460-23180 3644-17016 6853-37824 46271-106966 243170 METIS 14557-38160 14557-37040 14557-17450 14557-33364 302142 METIS K 14993-35528 14131-40152 14971-41476 14133-12460 298540

Table 5.11: Workload Balance 4 partitions loc-Brightkite Graph

ca-HepPh Partition 1 Partition 2 Partition 3 Partition 4 CPE Algorithm Nodes - Edges Nodes - Edges Nodes - Edges Nodes - Edges Edges Node 3002-149352 3002-23903 3002-11410 3002-8935 43410 Edge 451-13961 413-39377 1875-37400 9269-58958 87314 METIS 3002-14682 3002-13997 3002-13418 3002-17487 177426 METIS K 2914-13113 3090-15458 3090-16986 2914-13567 177886

Table 5.12: Workload Balance 4 partitions ca-HepPh Graph

As in Bi-partitioning, METIS has the overall best workload balance, especially in the undi-rected loc-Brightkite and ca-HepPh graphs. However, like in the workload balance results of Bi-partitioning, Node and Edge partitioning performs better in the ego-facebook graph. Edge partitioning also performs better in the Wiki-vote graph and even performs reasonably in the p2p-Gnutella31 graph (the two directed graphs). This suggests that Edge partitioning is a viable option to 4-Way partition directed graphs. In the workload distribution of the Bi-partitioning, the Edge partitioning was mentioned as having one of the most unbalanced partitions in the wiki graph. As this is not the case anymore now, it can only mean that the supernodes have been dispersed over the partitions while Node, METIS and METIS K could not handle these nodes efficiently. Looking at the extremely unbalanced edges in Node partitioning, the supernodes present in the wiki graph are practically all in the first quarter of the graph nodes.

To make edge partitioning even better in graphs that contain supernodes, the supernodes should be copied to all partitioning machines. When the partitions need to communicate with the supernode(s) it will occur locally. After all the partitions are done, the results will be merged, thereby minimizing the CPC and balancing the workload.

Pagerank and BFS will be evaluating the CPC of the five graphs in four partitions. The cross partition communication of all graphs and all partitioning algorithms can be seen in the following bar plots Figure 5.4, 5.5:

(29)

Figure 5.4: 4 Partition Pagerank

METIS and METIS K still have the highest CPC of the four partitioning methods. When compared to the previous CPC graph, the difference between the CPC of METIS and the other two partitions is smaller in practically all graphs, regardless of running on BFS or Pagerank. However, there is still a big difference in the ca-HepPh, the ego-facebook and the loc-Brightkite graph. These are all undirected graphs. Noteworthy is the low CPC in the wiki-Vote graph. This is likely due to the unbalanced workload in this directed graph. Looking at the workload balance and the CPC, METIS is a good partitioning method to partition undirected graphs in four parts. Edge Partitioning is a viable option to partition directed graphs or undirected graphs that have a minimal degree variation, which means that the degree of every node is more or less consistent with the average degree (such a ego-facebook).

The previous process will be repeated for six partitions. The results of the workload balance of the graphs partitioned in six parts can be seen in the tables (Figure 5.13, 5.14, 5.15, 5.16, 5.17).

(30)

ego-Facebook Partition 1 Partition 2 Partition 3 Partition 4 Partition 5 Partition 6 CPE

Algorithm Nodes - Edges Nodes - Edges Nodes - Edges Nodes - Edges Nodes - Edges Nodes - Edges Edges

Node 673-5223 673-12399 673-7765 673-8822 673-12060 674-4860 37105

Edge 561-3917 461-5032 623-7891 749-6256 676-8220 969-14603 42315

METIS 673-7112 674-13179 672-4429 673-6723 674-4840 673-4204 47747

METIS K 672-4256 693-6205 654-5646 693-13483 653-6255 674-4282 48107

Table 5.13: Workload Balance 6 partitions ego-Facebook Graph

wiki-Vote Partition 1 Partition 2 Partition 3 Partition 4 Partition 5 Partition 6 CPE

Node 1186-31315 1186-8914 1186-295 1186-76 1186-45 1185-28 63016

Edge 411-5460 313-3204 250-2499 275-2100 368-2220 5498-9348 78858

METIS 1185-450 1186-1873 1186-6744 1186-2421 1187-6133 1185-2179 83889

METIS K 1152-14893 1221-805 1221-8163 1152-979 1155-924 1214-510 77415

Table 5.14: Workload Balance 6 partitions wiki-Vote Graph

p2p-Gnutella31 Partition 1 Partition 2 Partition 3 Partition 4 Partition 5 Partition 6 CPE

Node 10431-17367 10431-8904 10431-8154 104310-8476 10431-9783 10431-12085 83123

Edge 5708-8785 7931-7203 9717-7420 10740-8729 12462-11598 16028-9783 94374

METIS 10431-4598 10431-4162 10431-3962 10431-4026 10431-4296 10431-4167 122681

METIS K 10451-4468 10423-4287 10421-4203 10418-4079 10453-4039 10420-4186 122630

Table 5.15: Workload Balance 6 partitions p2p-Gnutella31 Graph

loc-Brightkite Partition 1 Partition 2 Partition 3 Partition 4 Partition 5 Partition 6 CPE

Node 9705-143050 9705-28750 9705-13058 9705-4236 9704-3918 9704-6590 228554

Edge 626-9504 1550-10046 2933-8106 4208-13564 8886-29214 40025-71258 286464

METIS 9704-17908 9705-22200 9705-11820 9704-6382 9705-9060 9705-22862 337924

METIS K 9443-12432 9898-24370 9996-15960 9447-5634 9996-25254 9448-9418 335088

Table 5.16: Workload Balance 6 partitions loc-Brightkite Graph

ca-HepPh Partition 1 Partition 2 Partition 3 Partition 4 Partition 5 Partition 6 CPE

Node 2001-134443 2001-16750 2001-15128 2001-7635 2002-6188 2002-5544 51322

Edge 402-9141 106-8604 356-23693 948-19983 2542-24876 7654-39171 111542

METIS 2001-6213 2002-5945 2001-6911 2001-6961 2001-6166 2002-7282 197532

METIS K 1992-7010 2012-5649 2058-8452 1942-5775 2061-6657 1943-6243 197224

Table 5.17: Workload Balance 6 partitions ca-HepPh Graph

The workload balance of the five graphs in six partitions is comparable to the workload balance of four partitions. METIS and METIS K is still more efficient in the ca-HepPh and loc-Brightkite graph, while Edge partitioning is better in the Wiki-vote and ego-facebook graph. Both partitioning algorithms perform reasonably well on loc-gnutella. Node partitioning is very unbalanced in all the graphs except for the ego-facebook graph, for discussed in the above.

The CPC of the five graphs will be evaluated by Pagerank and BFS. The cross partition communication of all graphs and all partitioning algorithms can be seen in the following bar plots Figure 5.6, 5.7:

(31)

Figure 5.6: 6 Partition Pagerank

As shown in the CPC plots of the five graphs, METIS and METIS K are still the highest in every graph. This means that the same conclusions can be derived from these workload tables and CPC plots as those from the graphs that were partitioned four times.

To give a quick overview of the most important results made by the five graphs partitioned in two, four and six parts, we constructed a small classification scheme. This scheme mostly involves Edge partitioning and the recursive form of METIS, as METIS K consistently produces worse results then METIS (recursive) and the partitions made by Node partitioning are in practically every workload extremely unbalanced. The overview consists out of three categories:

In the Edge Partitioning is most efficient category are two graphs, the ego-facebook and the Wiki-vote graph. Both graphs have produced even more balanced partitions then METIS and have lower CPC. The reason behind the good partitioning of the ego-facebook graph is that the

(32)

nodes have a minimal degree variation.

The directed Wiki-vote graph contains supernodes. Edge partitioning can in most cases decently handle partitioning directed graphs with supernodes, in contrast with node partitioning. The Bi-partitioning of the Wiki-vote graph is not as efficient as recursive METIS, but more efficient in K-Way partitioning. In addition, Edge partitioning is substantially faster then the complex METIS partitioning. As mentioned earlier in the result section, to be able to get the most out of Edge partitioning with supernodes, the supernodes can be tagged and copied to the partitioning machines, thereby reducing the CPC and balancing the workload.

There is one graph open for speculation, the directed p2p-Gnutella31 graph. When we eval-uate the workload balance of this graph, Edge partitioning has a more fluctuating workload balance but less CPC then METIS. Additionally, Edge partitioning is substantially faster then METIS, making the more efficient partitioning algorithm open for speculation here. To be able to express the difference in efficiency in the five partitioning algorithms, we developed an expression for undirected and directed graphs for all the partitioning algorithms. This formula —yielding the weighted average of the required computation times— is a good instrument to compare the difference in efficiency in the five partitioning algorithms. The following variables are needed:

• Tedges= The time it to compute an edge in a partition

• Tnodes= The time it to compute a node in a partition

• Tcommunication = The time it takes to communicate with another partition

• Heaviest partition = The partition that takes the longest time to compute

• Cross partition communication = the number of times heaviest partitions needs to com-municate with other partitions

• Nmax = number of nodes in the heaviest partition_{the number of nodes in the graph}

• Emax= number of edges in the heaviest partition_{the number of edges in the graph}

These variables can be combined into the formula that is able to express the difference in efficiency in the five partitioning algorithms. The heaviest partition is the partition that is the most unbalanced, thereby confiscating the most time. If we take the time that the nodes and edges take to compute and adding the time of the CPC of that partition, an accurate estimation of the efficiency can be made. The formula can be expressed as follows:

Efficiency = hT i = Tedges· hEmaxi + Tnodes· hNmaxi + Tcommunication· CP C hEmaxi + hNmaxi + CP C

.

Figure 5.8: Efficiency Formula

All the non time related variables can be computed from the workload balance of their re-spective graphs. Of course the workload balance of the required number of partitions must be chosen. The time related variables are architecturally dependent, e.g. different when partition-ing is done on a GPU then when it is done on different machines within a cluster. To use the formula, the three time-related variables Tedges, Tnodes and Tcommunication should be calculated

first on the user’s architecture and algorithm of choice.

In the last category, METIS Partitioning is most efficient, are the two undirected graphs loc-Brightkite and ca-HepPh. METIS is very efficient in partitioning undirected graphs. The only reason METIS was outperformed by Edge partitioning on the undirected ego-facebook graph is the minimal degree variation in the nodes. Although METIS is a more complex algorithm then Edge partitioning and has a higher CPC, the workload balance is substantially better in these two cases. As a result, METIS will be the most viable option to partition undirected graphs if the nodes of the graph do not have a minimal degree variation.

(33)

5.1 Test Environment

In order to compare the different partitioning algorithms with all types of graphs and then run the partitioned graphs on Pagerank and BFS, a test environment is made. This environment consists of multiple phases. The chart in (Figure 5.9) gives a quick overview of these phases.

Graph Partition Phase Experimental Phase Parse Phase Draw Phase Partitions Logging

Cross Partition Communication

Figure 5.9: Phase Chart of all processes

The process starts with creating the graph from a file which consists of a large amount of sets of two nodes. These sets represent the edges in the graph. After creating the graph, the partitioning phase begins. A partitioning algorithm creates a partitioned graph and the graph partitions will be send to the Experimental phase. Here the partitioned graph will be ranked by Pagerank or traversed by Breadth First Search in order to log the cross partition communication. When this phase is finished, the log contains a large number of parent-child communication strings that also contains the partition information of the communicating nodes. These will be parsed by the parser in the parsing phase in order to get the cross partition communication for the specific graph and partitioning method.

DAS4

VirtualENV

Screen

Figure 5.10: Test Environment Chart

This chart represents the environment in which the processes take place (Figure 5.10). The processes explained in the previous figure (Figure 5.9) take longer as the graph size increases. As only large graphs are used in this thesis, the time it takes to compute the cross partitioning of a graph with various partitioning algorithms and distributive algorithms varies between a couple of hours and a whole weekend. The sum of the time it would take to compute all cross partition communication is too high for the limited time frame in which the experiments must

(34)

be carried out. This is why all the calculations are done on a DAS4 supercomputer cluster of the Vrije Universiteit (VU). First, the required libraries are installed within the virtual environment virtualENV. This is unavoidable because a normal user does not have access to the system directory of the DAS4. The virtual environment enables the user to install libraries that are normally located in the system directory. The code can be executed after loading the required modules already present on the DAS4, and installing the dependencies of the code. Subsequently screen is installed, a program that allows the user to create several terminals within the DAS4 environment. This very useful tool that allows to run multiple processes on the DAS4 comes with a perhaps even more important option: when running, the user is able to detach from the screen and exit the DAS4 environment. This means that the user does not need to maintain a constant connection with the DAS4 during the time consuming computations.

(35)

CHAPTER 6

Conclusion

If the workload is well balanced, i.e. evenly distributed over the partitions, and the cross par-tition communication (CPC) is low, then the parpar-titioning of the graph is considered efficient. Several partitioning methods do not meet the requirements to partition big graphs efficiently. For instance, Spectral partitioning has noticeably the lowest CPC, with the exception of the ego-Facebook graph. However, Spectral partitioning creates partitions with highly unbalanced workloads because it focuses on minimal communication between partitions and less on work-load balance. It assigns a heavy weight to cross partition edges. The workwork-load balance is very important when partitioning big graphs, making Spectral partitioning not a viable option to do so.

When all the nodes in the graph have roughly the same degree, Node and Edge partitioning would be perfect. In other cases, if the graph contains nodes with a higher then average degree, these nodes need to be evenly distributed over the number of partitions in order to ensure a good workload balance. Node partitioning does not take nodes with an higher then average degree into account and, as a result, almost always ends up with highly unbalanced partitions.

To be able to give a good overview of the results, we created a small classification scheme. This scheme mainly focuses on Edge partitioning and the recursive version of METIS. This is because METIS K-way is repeatedly outperformed by METIS, and as previously mentioned, Spectral- and Node-partitioning are no good options to partition big graphs. The overview con-sists out of three categories:

Two of the five graphs can be placed in the first category. These are the ego-facebook and the Wiki-vote graphs. Graphs with a minimal degree variation like the undirected ego-facebook can be partitioned very well by Edge partitioning. Not only is the workload fairly well-balanced, but the CPC is lower then that of METIS as well. The directed graph Wiki-vote, which con-tains supernodes, can be handled decently by Edge partitioning. To be able to get the most out of Edge partitioning, the supernodes can be tagged and copied to the partitioning machines, thereby reducing the CPC and balancing the workload.

When considering the second category, the directed p2p-Gnutella31 graph is the only graph open for speculation. Edge partitioning is substantially faster then METIS and makes partitions with less CPC, but it creates partitions with a more fluctuating workload balance then METIS. In order to ensure the best choice for a partitioning algorithm, a procedure is developed to com-pare the efficiency of the five partitioning algorithms. This procedure is in the form of a formula

(36)

that yields the weighted average of the required computation times. Using this formula, the user can compare the efficiency of each of the partitioning algorithms. The formula can be expressed as follows:

Efficiency = hT i = Tedges· hEmaxi + Tnodes· hNmaxi + Tcommunication· CP C hEmaxi + hNmaxi + CP C

.

The two undirected ca-HepPh and the loc-Brightkite graphs can be placed in the third cate-gory. METIS would be a good overall choice for most undirected graphs, which can be seen when looking at the workload balance and CPC of the two graphs. It excels in partitioning undirected graphs.

When looking at all the directed graphs, Edge partitioning is a more attractive alternative as the workload balance is comparable with that of METIS and the CPC is generally lower. Ad-ditionally, Edge partitioning is a substantially less complex and faster algorithm then METIS. If supernodes are accounted for, Edge partitioning will definitely be a viable option to use to partition directed graphs or graphs with a minimal degree variation instead of the more complex partitioning algorithm METIS. In these cases the difference in efficiency justify the use of the less complex algorithm Edge partitioning.

(37)

6.1 Future Outlook

This project can be extended by dividing the full graph into partitions with the same algorithms, build a buffer and extend the test-environment in such a way that the partitions can communi-cate with each other on the DAS4. This would then also be done with Pagerank and Breadth First Search. In addition to that, multiple partitions could give an even more accurate picture of the efficiency differences between the partition algorithms. The time can be logged if all the processes are done on one machine. Machine timing for reasons of comparison and efficiency measurement is only meaningful when processes are run under exactly the same hardware and architectural circumstances. The time it takes to load a partition in GPU memory or to com-municate between different computers on a cluster can also be useful. If this is relatively large, the cross communication on a GPU or on a cluster is very expensive. This means that cross par-titioning would weigh heavier in making a perfect partition then the workload balance between partitions.

(38)

(39)

Bibliography

[1] Biomedical. Biomed central - maximum common subgraph: some upper bound and lower bound results. http://www.biomedcentral.com/1471-2105/7/S4/S6/figure/F1, 2015. [Online; accessed 16-June-2015].

[2] Dartmouth. Department of computer science - lecture 18. http://www.cs.dartmouth.edu/ ~thc/cs10/lectures/0505/0505.html, 2015. [Online; accessed 16-June-2015].

[3] Inderjit Dhillon, Yuqiang Guan, and Brian Kulis. A fast kernel-based multilevel algorithm for graph clustering. In Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining, pages 629–634. ACM, 2005.

[4] Chris HQ Ding, Xiaofeng He, Hongyuan Zha, Ming Gu, and Horst D Simon. A min-max cut algorithm for graph partitioning and data clustering. In Data Mining, 2001. ICDM 2001, Proceedings IEEE International Conference on, pages 107–114. IEEE, 2001.

[5] Shantanu Dutt. New faster kernighan-lin-type graph-partitioning algorithms. In Computer-Aided Design, 1993. ICCAD-93. Digest of Technical Papers., 1993 IEEE/ACM Interna-tional Conference on, pages 370–377. IEEE, 1993.

[6] Miroslav Fiedler. Algebraic connectivity of graphs. Czechoslovak mathematical journal, 23(2):298–305, 1973.

[7] geeksquiz. Graph shortest paths. http://geeksquiz.com/algorithms/ graph-shortest-paths/, 2015. [Online; accessed 16-June-2015].

[8] George Karypis and Vipin Kumar. Metis-unstructured graph partitioning and sparse matrix ordering system, version 2.0. 1995.

[9] George Karypis and Vipin Kumar. A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM Journal on scientific Computing, 20(1):359–392, 1998.

[10] George Karypis and Vipin Kumar. Multilevelk-way partitioning scheme for irregular graphs. Journal of Parallel and Distributed computing, 48(1):96–129, 1998.

[11] George Karypis and Vipin Kumar. A parallel algorithm for multilevel graph partitioning and sparse matrix ordering. Journal of Parallel and Distributed Computing, 48(1):71–95, 1998.

[12] Amy N Langville and Carl D Meyer. Google’s PageRank and beyond: The science of search engine rankings. Princeton University Press, 2011.

[13] Md Mostofa Ali Patwary, Rob H Bisseling, and Fredrik Manne. Parallel greedy graph matching using an edge partitioning approach. In Proceedings of the fourth international workshop on High-level parallel programming and applications, pages 45–54. ACM, 2010.

[14] PENCARIAN. Metode pencarian dan pelacakan. https://aiukswkelasgkelompok7. wordpress.com/metode-pencarian-dan-pelacakan/, 2015. [Online; accessed 16-June-2015].

(40)

[15] recibe. Discovering epistemological axes for academic programs in computer science through network analysis. http://recibe.cucei.udg.mx/revista/es/vol1-no1/computacion02. html, 2015. [Online; accessed 16-June-2015].

[16] STANFORD. Stanford Large Network Dataset Collection. http://snap.stanford.edu/ data/index.html, 2015. [Online; geopend 15 April 2015].

[17] @tutorialhorizon. Graph representation adjacency matrix and adjacency list. http://algorithms.tutorialhorizon.com/ graph-representation-adjacency-matrix-and-adjacency-list/, 2015. [Online; accessed 16-June-2015].

[18] VLSI. Practical problems in vlsi physical design. http://users.ece.gatech.edu/limsk/ book/slides/pdf/KL-partitioning.pdf, 2015. [Online; accessed 16-June-2015].

[19] Wikipedia. Network theory — wikipedia, the free encyclopedia. https://en.wikipedia. org/w/index.php?title=Network_theory&oldid=663504157, 2015. [Online; accessed 16-June-2015].

Partitioning of Big Graphs

Bachelor Informatica