Characterizing problems for realizing policies in self-adaptive and self-managing systems

by

Sowmya Balasubramanian BSc, University of Madras, 1997 MCA, Madurai Kamaraj University, 2001

A Thesis Submitted in Partial Fulfillment of the Requirements for the Degree of

Master of Science

in the Department of Computer Science

© Sowmya Balasubramanian, 2013
University of Victoria

All rights reserved. This thesis may not be reproduced in whole or in part, by photocopying or other means, without the permission of the author.


Characterizing Problems for Realizing Policies in Self-Adaptive and Self-Managing Systems

by

Sowmya Balasubramanian BSc, University of Madras, 1997 MCA, Madurai Kamaraj University, 2001

Supervisory Committee

Dr. Hausi A. Müller, Supervisor (Department of Computer Science)

Dr. Ulrike Stege, Departmental Member (Department of Computer Science)


Supervisory Committee

Dr. Hausi A. Müller, Supervisor (Department of Computer Science)

Dr. Ulrike Stege, Departmental Member (Department of Computer Science)

Abstract

Self-adaptive and self-managing systems optimize their own behaviour according to high-level objectives and constraints. One way for human administrators to effectively specify goals for such optimization problems is using policies. Over the past decade, researchers have produced various approaches, models and techniques for policy specification in different areas including distributed systems, communication networks, web services, autonomic computing, and cloud computing. Research challenges range from characterizing policies for ease of specification in particular application domains to categorizing policies for achieving good solution qualities for particular algorithmic techniques.

The contributions of this thesis are threefold. Firstly, we give a mathematical formulation for each of the three policy types, action, goal and utility function policies, introduced in the policy framework by Kephart and Walsh. In particular, we introduce a first precise characterization of goal policies for optimization problems.


Secondly, this thesis introduces a mathematical framework that adds structure to the underlying optimization problem for different types of policies. Structure is added either to the objective function or the constraints of the optimization problem. These mathematical structures, imposed on the underlying problem, progressively increase the quality of the solutions obtained when using the greedy optimization technique. Thirdly, we show the applicability of our framework through case studies by analyzing several optimization problems encountered in self-adaptive and self-managing systems, such as resource allocation, quality of service management, and Service Level Agreement (SLA) profit optimization, to provide quality guarantees for their solutions. Our approach combines the algorithmic results by Edmonds, Fisher et al., and Mestre, and the policy framework of Kephart and Walsh. Our characterization and approach will help designers of self-adaptive and self-managing systems formulate optimization problems, decide on algorithmic strategies based on policy requirements, and reason about solution qualities.


Contents

Supervisory Committee

Abstract

Table of Contents

List of Tables

List of Figures

Acknowledgements

Dedication

1 Introduction
1.1 Related Work
1.2 Approach of Kephart and Walsh
1.3 Our Approach
1.4 Organization of the Thesis

2 Overview
2.1 Autonomic Systems
2.1.1 Characteristics of Autonomic Systems
2.1.2 Self-Optimizing Systems
2.2 A Handbook for Designing Policy-Driven Optimization Strategies
2.3 The Greedy Technique
2.4 Summary

3 Objective Function Based Framework
3.1 Resource Allocation in Distributed Systems
3.2 Resource Allocation for QoS Management
3.3 Summary

4 Constraint Based Framework
4.1 Data Center Based Scheduling Problem
4.2 SLA Based Profit Optimization
4.3 Summary

5 Case Studies: An Evaluation and Analysis
5.1 Scheduling on a Distributed Set of Clouds
5.2 Resource Allocation for Clustered Web Farms
5.3 Summary

6 Conclusions
6.1 Summary and Discussion
6.2 Contributions
6.3 Future Directions


List of Tables

Table 2.1 Mapping of objective functions and constraint classes to solution qualities/policy types for the greedy strategy.


List of Figures

Figure 1.1 Action, Goal and Utility Function Dartboards

Figure 1.2 Assuring the Limits [27]

Figure 3.1 Set system (U, F) with F = {∅, {1}, {2}, {3}, {1, 2}, {2, 3}} forms a matroid as it satisfies the downward-closure and the augmentation properties.

Figure 3.2 Set system (U, F) is not a matroid as it does not satisfy the downward-closure property: B = {2, 3} ∈ F but A = {3} ∉ F.

Figure 4.1 Revenue of a Schedule

Figure 4.2 Set system (U, F) is 2-extendible.

Figure 4.3 Unrestricted Processing Times

Figure 4.4 Equal Processing Times

Figure 5.1 Scheduling on a Distributed Set of Clouds


Acknowledgements

I am greatly indebted to my supervisor, Dr. Hausi Müller, for his valuable mentoring, financial support and extreme patience. I would like to thank him for guiding the work done in this thesis by always responding to my thoughts and ideas with a lot of optimism and enthusiasm. He has many roles and responsibilities in the university, but I was amazed by how he made time not only for me but for each of his students. He is the best supervisor any student can have, and I consider myself very lucky to have him as my supervisor. I have enjoyed every moment of being a part of his research group and I am thankful to him for ensuring such a warm and supportive work environment.

My special thanks go to Dr. Ulrike Stege for her help and constant encouragement while collaborating with me on the work done in this thesis. Her positive approach and feedback on the ideas in this thesis made our research meetings a lot of fun. She has a very special quality of making people around her feel comfortable and at home. I am also grateful to her for agreeing to serve as a member of my supervisory committee.

I would also like to acknowledge the financial support I received from the Department of Computer Science at the University of Victoria. The friendly staff members in the Department of Computer Science deserve a special mention for helping me many times during my studies. I would like to thank all the fellow graduate students of the RIGI group for being such a friendly bunch of people to work with.

Finally, I am grateful to my family members for standing by me during the good and hard times in this journey.

Don’t find a fault, find a remedy.
(Henry Ford)


Dedication

To the five most important people in my life... My son Tharun — for his hugs, love and smiles!

&

My Husband — for everything — since my words will not do justice to the extent of help and support I received from him.

&

My Mom, Dad, Sister — for being there for me no matter what.

Being deeply loved by someone gives you strength, while loving someone deeply gives you courage


Chapter 1

Introduction

The ever-growing complexity and sophistication of software systems and the constantly evolving and dynamic nature of their environments have led software engineers to explore new ways of designing systems and devices. An important direction emerging over the past decade is the design of self-adaptive systems. Such a system constantly adjusts its behaviour at run-time in response to its perception of its environment and its own state, in the form of fully or semi-automatic self-adaptation [15, 37]. While some self-adaptive systems can function without human intervention, many of them do require guidance from system administrators to optimize quality of service (QoS) properties. Such high-level objectives, often expressed in the form of policies, tune the self-* operations of such a system [19]. We discuss these operations in more detail in Chapter 2. Policies enable the system to perform appropriate actions and change system behaviour at run-time through high-level policy modification. To give an example, in location-based services, an emerging area in mobile commerce, the services to be delivered to customers are based on prior knowledge of their profiles and the amount of sensitive information that can be revealed to providers. These services are controlled by security and privacy policies dictated by customers.


1.1 Related Work

Policy-based systems span a wide range of application domains including autonomic communication [11], privacy and security management [2, 25, 6], autonomic computing [18, 21], provisioning in computing systems [17], grid and cloud computing [16, 24], service oriented systems [26, 46], and smart web services [9]. In a standard architectural view of autonomic systems, such systems consist of functional and management components [11]. It is the responsibility of the management components to monitor and influence the behaviour of the functional components using autonomic communication. In autonomic communications, several trust and security concerns arise. To address such concerns, security tasks are performed by the management units with the help of system policies, particularly the security policies. In mobile commerce, location-based services aim to provide customers with point-of-need personalized information based on their current location, time and profile. Such information could include location-based tourist information or location-based advertising. However, this requires the mobile service providers to pass on sensitive information about the customer, such as their age and salary, to vendors. Security policies, based solely on customer preferences, need to be enforced to avoid any breach of security and privacy [2]. Schemes for addressing such security issues are of two types: Trusted Third Party (TTP)-based and TTP-free. Solanas et al. point out that many of the TTP-based schemes are policy-based [40].

In an autonomic system, the autonomic manager monitors an element using data from the sensors and executes changes using the effectors. The changes to be executed by the autonomic manager are configured by human administrators in the form of high-level objectives. Such objectives are described in the form of policies [18]. In Dryad, a cloud computing framework proposed by Microsoft Research, every application is modelled as a Directed Acyclic Graph (DAG) representing its data flow; vertices represent programs and edges represent the data channels. The execution of a Dryad job is handled by a job manager. It is the responsibility of the job manager to transform the job graph dynamically based on user-supplied policies [16]. Many multimedia applications, such as video-on-demand, have soft real-time QoS requirements, as such applications are still considered functionally correct even if they do not meet the given requirements. Lutfiyya et al. develop a policy-based framework for QoS management in such applications [26]. The advantage of this approach, in contrast to other techniques, is that it does not require users to have detailed knowledge of the resource needs in advance. To give an example, the user or the developer is not expected to know the number of video frames that need to be delivered per second by an application. Instead, such details are embedded in the policy framework.

Policy-based networks also play a key role in managing and ensuring QoS properties by optimizing the use of resources to meet various user needs [31, 39]. In particular, a policy instructs a network node on how to manage requests for network resources. It is essentially a mechanism for encoding business objectives concerning the proper use of scarce resources. Policies define, on a per user, group, application, or time-of-day basis, what resources a given consumer can use in a given scenario. Emerging policy-based networking (PBN) technologies help IT ensure that network users get the quality of service (QoS), security, and other capabilities they need. Policy-based networks have the intelligence, in the form of business rules, needed to govern network operations. The networks use hardware and software technologies that let IT managers prioritize access to and consumption of network resources. Policy-based networks help organizations classify traffic types, determine priorities, and measure and adjust traffic flow as needed. New policy servers and directories store user, resource, and policy information. In addition, network management software links hardware to directories and to policy and application servers, tracks resource use, and reroutes or prioritizes traffic as needed.

1.2 Approach of Kephart and Walsh

In the autonomic computing domain, different approaches and techniques have been proposed over the last decade to create policy frameworks (e.g., [7, 10, 41] to cite just a few). Given the broad spectrum of disciplines that are brought together with the notion of autonomic computing, any possible attempt at defining policy for such systems needs to be very broad. An approach that is most relevant to our work is the unifying policy framework created by Kephart and Walsh [23]. Subsequently, this framework was also used by Kephart and Das to study the role of utility function policies for self-management [22]. More recently, Bahati et al. created a framework that relies on reinforcement learning to define policy sets that meet different performance objectives [3, 4]. This learning approach uses past experience with policies to propose changes to meet performance requirements.

Kephart and Walsh’s approach to classifying policies is based on the AI concept of rational agents — reflex agents as well as goal and utility function based agents [36]. As a result, their framework features three different types of policy sets — action, goal, and utility-function policies — to solve optimization problems encountered in the realm of self-managing systems [23]. Action policies focus on “What to do?” and directly specify the actions to be performed in the current state as recommended by rational behaviour. Goal policies focus on “What do we desire?” and specify a single state or a set of desired states. Utility-function policies focus on “What is the best choice?” and ask for a state with highest utility value. While the three policy types differ in the level of specification, every implementation of each such policy always involves a sequence of actions using an algorithmic strategy [20].


Autonomic computing can be viewed as policy-based self-management. In any autonomic system, individual autonomic elements can be considered as solving optimization problems at the lowest level in the autonomic computing reference architecture (ACRA) [20, 29]. When studying the three types of policies in the light of optimization problems, a natural question arises: “What exactly do the three policy types correspond to when we solve optimization problems at higher levels of goal management?” We propose the following answer:

• A utility function policy for an optimization problem asks for “the best quality or an optimal solution.” This can be interpreted as targeting the state with the highest utility value.

• A goal policy for an optimization problem asks for “a good quality solution or a close approximation to the optimal solution.” In other words, the set of desirable states will include the ones with a utility value comparable to (but not necessarily equal to) the best quality solution. The notion of being a close approximation to the optimal solution is made more precise in Chapter 3.

• An action policy in this setting asks for “a best possible choice at every stage.” In other words, it recommends a local action that is optimal among all available local choices.
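To make this reading concrete, the three interpretations above can be sketched on a small, hypothetical optimization problem. The server instance, the 1/2 quality threshold, and all names below are our illustrative assumptions, not part of Kephart and Walsh's framework:

```python
from itertools import combinations

# Hypothetical problem: activate at most 2 of 4 servers; the utility of a
# state is the total capacity of the activated servers.
capacity = {"s1": 5, "s2": 8, "s3": 3, "s4": 7}

def utility(state):
    return sum(capacity[s] for s in state)

# The state space: all feasible activation sets (size at most 2).
states = [frozenset(c) for k in range(3) for c in combinations(capacity, k)]

# Utility function policy: ask for the state with the highest utility value.
best = max(states, key=utility)

# Goal policy: any state whose utility is within a factor of the optimum
# (here 1/2, an illustrative quality guarantee) is a desirable state.
goal_states = [s for s in states if utility(s) >= utility(best) / 2]

# Action policy: a local rule -- in the current state, activate the single
# server that increases utility the most (no global guarantee implied).
def action(state):
    candidates = [s for s in capacity if s not in state]
    return max(candidates, key=lambda s: utility(state | {s}))

print(sorted(best), utility(best))  # the state targeted by the utility function policy
print(action(frozenset()))          # the locally best first action
```

The three policies thus differ only in what they demand of the same state space: one optimal state, a set of good-enough states, or a next action.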

The first step of Kephart and Walsh’s approach is to formulate the optimization problem to be solved [23]. Subsequently, the policy author designs algorithms to solve the problem to meet the different policy specifications. The algorithms progressively increase in sophistication as the policy set changes from action to goal, and then to utility function policies. This leads to the view that for many optimization problems one needs to design sophisticated optimization algorithms to meet utility function policy specifications. Kephart and Walsh use a data center resource allocation problem to illustrate this view.

1.3 Our Approach

Our approach is complementary to the one by Kephart and Walsh and aims to answer the following research question: “Is it possible to provide mathematical structure for either the objective function or the constraints of an optimization problem so that a simple algorithmic strategy can produce solutions with guaranteed qualities and hence meet the requirements of goal or utility function policies?” In this thesis we show that it is indeed possible.

We illustrate our approach using the greedy technique. The reason for focusing on this technique is twofold: (1) self-adaptive and self-managing systems are complex and can benefit from simple, yet powerful algorithmic strategies that are easy to comprehend and implement; (2) there are similarities between action policies and the greedy technique [34]. Namely, in action policies, the author deems that, in the current state, the recommended action is more desirable than alternative actions, and hopes that the action is good with respect to the global solution (which may not always be the case). Along the same lines, the greedy technique makes locally optimal choices hoping that this will lead to a globally optimal solution.
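As a small, self-contained illustration of why the hoped-for global optimality may fail to materialize, consider making change with the hypothetical denomination set {1, 3, 4} (an assumption of ours, chosen only because it defeats the greedy rule): always taking the largest coin is locally optimal at every step yet globally suboptimal.

```python
def greedy_change(amount, coins):
    # Locally best choice at every step: take the largest coin that fits.
    used = []
    for c in sorted(coins, reverse=True):
        while amount >= c:
            amount -= c
            used.append(c)
    return used

def optimal_change(amount, coins):
    # Dynamic programming over all amounts up to `amount` (globally optimal).
    best = {0: []}
    for a in range(1, amount + 1):
        options = [best[a - c] + [c] for c in coins if a - c in best]
        if options:
            best[a] = min(options, key=len)
    return best.get(amount)

print(greedy_change(6, {1, 3, 4}))   # three coins: [4, 1, 1]
print(optimal_change(6, {1, 3, 4}))  # two coins, e.g. 3 + 3
```

The greedy sequence of locally desirable actions uses three coins, while the optimum needs only two; the frameworks in Chapters 3 and 4 characterize when such a gap cannot occur or is provably bounded.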

We offer a dartboard as a metaphor to explain how we add mathematical structure to optimization problems encountered in self-adaptive and self-managing systems. The dartboard represents the entire solution space. Individual pixels on the dartboard represent individual solutions. Throwing a dart corresponds to hitting a pixel and thus picking a solution. Some of the solutions are better than others and some are even optimal. The dartboard that corresponds to action policies has no structure. Thus, when a player throws a dart at an action dartboard, he or she will arbitrarily get a good or a bad solution, regardless of how good the player is at darts. A goal dartboard has some regions that are delineated by metal frames, as is customary on a real dartboard. While there are multiple regions, there is a region that corresponds to high quality solutions (including the best ones). An experienced darts player will aim for the region containing the high quality solutions. Finally, the utility-function dartboard, besides the regions already discussed, contains regions containing only optimal solutions. Of course, the smart and skilled darts player will aim for the regions containing only optimal solutions. Figure 1.1 illustrates this metaphor.

In this thesis, we introduce a mathematical framework that adds structure to the underlying optimization problem for different types of policies. Structure is added either to the objective function or to the constraints of the optimization problem. These mathematical structures, imposed on the underlying problem, progressively increase the quality of the solutions obtained using the greedy optimization technique. Our approach is based on the algorithmic results by Edmonds [12], Fisher et al. [13] and Mestre [28], and the policy framework by Kephart and Walsh [23].

Figure 1.1: Action, Goal and Utility Function Dartboards. Legend: optimal solution; good solution with quality guarantee; solution without quality guarantee.

Assurance for self-adaptive systems. Our framework is related to a view expressed by Jeff Magee during a panel discussion at SEAMS 2011. In his talk “Assuring the Limits in Self-Adaptable Systems” [27], he started by discussing three types of states: unsafe, safe and optimal (cf. Figure 1.2). He defined unsafe states as those that do not guarantee any worst-case quality, safe states as those that guarantee a worst-case limit or bound, and optimal states as those that adaptable systems must ultimately aim to achieve. These states are related to the action, goal and utility function policies in our thesis. As an example to illustrate his view, he discussed the decentralised coordination problem and referred to the work by Rogers et al. [35], who designed an approximation algorithm for this problem and proved a bound on the approximation ratio. Therefore, in his view, their work guaranteed a worst-case bound on the global utility function. In summary, his opinion was that assurance cannot be an emergent property of self-adaptive systems; it must be guaranteed by construction during the design process. That is the approach we take in this thesis.

Figure 1.2: Assuring the Limits [27]. Concentric regions of states: optimal, safe, unsafe; a worst-case bound separates the safe states from the unsafe ones.

In our framework, by gradually introducing structure to the underlying optimization problem, the problem is engineered in such a way that worst-case bounds are assured at design time rather than at run time. We would like to point out that, in contrast to the view of Jeff Magee, we do not consider action policies as unsafe in this thesis. While action policies could be considered unsafe in safety-critical systems, many applications have only soft quality requirements, as already pointed out by Lutfiyya et al. [26]. In such applications, even action policies will result in functionally correct systems.


1.4 Organization of the Thesis

Chapter 2 gives an overview of autonomic systems and in particular, the need for self-optimization. It then describes a generic handbook for designing policy-driven optimization strategies for self-adaptive systems and introduces the greedy technique in a generic setting.

Chapters 3 and 4 present the underlying mathematical structures for our objective function and constraint based frameworks and apply them to four optimization problems in the realm of adaptive systems: (1) Resource Allocation in Distributed Systems, (2) Resource Allocation for QoS Management, (3) Data Center Based Scheduling, and (4) Service Level Agreement (SLA) Profit Optimization.

In Chapter 5, we further validate the usefulness of our approach by applying it to two real-world problems: (1) Scheduling on a Distributed Set of Clouds, and (2) Resource Allocation for Clustered Web Farms.

Chapter 6 discusses how our results and our mathematical framework are relevant to industrial practice, even though the framework assumes the presence of structure in the underlying optimization problem. It concludes the thesis and outlines ideas for future work.

Remark. A preliminary version of this thesis appeared in the Proceedings of the ACM/IEEE 6th International Symposium on Software Engineering for Adaptive and Self-Managing Systems (SEAMS), pp. 70-79, 2011 [5].


Chapter 2

Overview

In this chapter, we start by describing autonomic systems and some of their desirable properties. Our work in this thesis focuses on one of the key properties of autonomic systems — self-optimization. Hence, we focus on self-optimizing systems and describe the high-level vision of our approach using a handbook for policy-driven optimization strategies. We end with a generic description of the greedy technique that is central to this thesis.

2.1 Autonomic Systems

Autonomic computing, a grand vision put forward by IBM in 2001 [20], has become a central area of research over the last decade. Today’s computing systems have become too massive and complex to manage. For example, corporate-wide systems typically comprise several heterogeneous software environments and applications, each of which runs into millions of lines of code and requires skilled IT professionals to install, configure, integrate and maintain. We are now reaching a threshold where the complexity of these systems is approaching the limits of human capability [18]. A possible solution to this software management crisis is to build autonomic systems.


Autonomic systems are computing systems that can manage themselves given high-level objectives from the administrators [18]. These systems are inspired by the autonomic nervous system in the human body, which manages several vital functions such as heart rate and blood pressure without requiring any conscious intervention of the brain. To give a simple example, major operating systems such as Mac OS X are able to configure themselves with almost no input from the user at installation and operation time.

2.1.1 Characteristics of Autonomic Systems

There are four important properties that autonomic systems commonly implement [18, 20]:

1. Self-Configuration. An autonomic system needs to be able to reconfigure itself and its various components seamlessly under unpredictable conditions. Such reconfigurations can range from being user-based to being automatic, based on monitoring and feedback loops. At design time, a system should be made configurable using capabilities such as separation of concerns, levels of indirection, integration mechanisms, scripting layers, plug-and-play, and set-up wizards.

2. Self-Optimization. The system needs to continually monitor and tune its resources and operations in order to meet the ever-changing needs of the application environment. It needs to optimize its operations in order to improve its performance with respect to a set of prioritized requirements. During system design, capabilities such as re-partitioning, re-clustering, load balancing or re-routing are incorporated to provide self-optimization.


3. Self-Healing. The system needs to be able to detect, diagnose and recover from extraordinary events that may cause some of its parts to malfunction. During system design, rules must be provided on how to select and move from the current configuration to an alternative safe configuration with minimal delay and loss of information.

4. Self-Protection. The system needs to protect itself by detecting and counteracting threats using built-in protective mechanisms. System design must include a comprehensive analysis of various possible attacks so that capabilities to proactively recognize and handle different kinds of threats can be incorporated. It needs to use early warning signs to anticipate and prevent system-level failure.

2.1.2 Self-Optimizing Systems

This thesis focuses on the self-optimizing properties of an autonomic system. According to Merriam-Webster’s online dictionary, the word “optimization” stands for “an act, process, or methodology of making something (as a design, system, or decision) as fully perfect, functional, or effective as possible.” In other words, the goal of optimization is to find the most effective way of doing something. A system is said to be self-optimizing if it can adapt and optimize its own behaviour at run-time, in response to modifications to its environment, to achieve certain objectives. Building such self-optimizing systems requires a set of rules that clearly describes how the system should detect changes to its environment, analyze those changes to understand their influence on the underlying requirements, and autonomously adapt its behaviour to achieve those requirements. Human administrators use policies to specify the rules for self-optimizing systems to follow.


2.2 A Handbook for Designing Policy-Driven Optimization Strategies

Engineers who design policies for a self-adaptive or self-managing system have an optimization problem to solve and a policy level to meet. For example, with the advent of service oriented architecture (SOA), organizations now use distributed services offered by third-party providers for complex applications. These services are provided by large data centers sharing available resources. Such service providers sign SLAs with their clients that specify costs and penalties associated with various performance levels. As an example, a data center might consist of a number of application environments that provide application services to customers. The SLA between the customer and the service provider will specify the average response time for various classes of customers [23]. The goal of the data center is to maximize profits by allocating resources effectively. Zhang and Ardagna designed a resource allocator for a data center with a scheduling policy to maximize the overall profit [46]. For this optimization problem, the objective function maximizes the profits subject to constraints that represent the various SLA conditions.

The policy designer looks for the answer to the following question: “What problem solving strategy is most appropriate to achieve the appropriate policy level (i.e., action, goal, or utility function policies)?” Ideally, given an optimization problem with an objective function and constraints, a designer can simply follow a handbook that points to an appropriate problem solving strategy to fulfill the requirements of the desired policy level.

Two critical components of a strategy for solving optimization problems are the algorithmic technique (e.g., greedy or dynamic programming) and the actual problem formulation (i.e., the objective function and constraints). In this thesis, we focus on the greedy technique. In Chapters 3 and 4, using results by Edmonds [12], we illustrate through examples that if the objective function is linear and the constraints form a matroid, then the greedy technique produces an optimal solution. Furthermore, using results by Fisher et al. [13] and Mestre [28], we describe sufficient problem properties such that the greedy technique produces a close-to-optimal solution. The mathematical properties include linearity and submodularity of objective functions as well as the matroid property and k-extendibility of constraint sets. In particular, we identify solution sets (using mathematical properties) for which we can prove quality guarantees. Thus, our handbook can recommend the following:

• If the objective function is linear and the constraints form a matroid, then utility function policies are most appropriate for this optimization problem.

• If the objective function is submodular and the constraints form a matroid or a k-extendible system, then goal policies are most appropriate.

• If the objective function is linear and the constraints form a k-extendible system, then goal policies are most appropriate.

• In all other scenarios action policies are most appropriate.
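A minimal sketch of the first recommendation, under assumptions of our own: the instance below uses a partition matroid (at most one item per category) with an assumed weight table, neither of which comes from the thesis. With a linear objective and matroid constraints, the greedy technique matches a brute-force optimum on this instance, consistent with Edmonds' result:

```python
from itertools import chain, combinations

# Hypothetical instance: weighted items, each in one category. The sets with
# at most one item per category form a partition matroid; the objective
# (sum of weights) is linear.
weight = {"a1": 4, "a2": 6, "b1": 5, "b2": 2, "c1": 3}
category = {"a1": "A", "a2": "A", "b1": "B", "b2": "B", "c1": "C"}

def independent(s):
    cats = [category[x] for x in s]
    return len(cats) == len(set(cats))  # at most one item per category

def greedy(items):
    # Locally best feasible choice at every step: heaviest item first.
    chosen = set()
    for x in sorted(items, key=weight.get, reverse=True):
        if independent(chosen | {x}):
            chosen.add(x)
    return chosen

def brute_force_optimum(items):
    # Exhaustively check every independent set (feasible only for tiny U).
    subsets = chain.from_iterable(combinations(items, k) for k in range(len(items) + 1))
    feasible = [set(s) for s in subsets if independent(set(s))]
    return max(feasible, key=lambda s: sum(weight[x] for x in s))

g, opt = greedy(weight), brute_force_optimum(weight)
assert sum(weight[x] for x in g) == sum(weight[x] for x in opt)  # greedy is optimal here
print(sorted(g))  # ['a2', 'b1', 'c1']
```

On this instance the greedy and the exhaustive optimum coincide, which is exactly the behaviour the matroid/linear cell of the handbook predicts; a single passing instance is of course an illustration, not the proof, which is due to Edmonds [12].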

Table 2.1 summarizes these findings. The dark-grey shaded cell in the top-left corner depicts the problem properties for which the greedy technique yields an optimal solution, suggesting utility function policies. Cells shaded in medium grey depict problem properties where the greedy technique achieves close-to-optimal solutions, and therefore goal policies are most appropriate. In this case, restricting the problem’s objective function from submodular to linear, or the constraints from k-extendible to matroid, will result in optimal solutions. For all other scenarios (i.e., light grey and white areas in the table), action policies are appropriate. Note that the greedy technique for problems with properties that fall into the light grey cells is only one step away from achieving guaranteed high quality solutions: restricting the problem’s objective function to submodular, or its constraints to k-extendible, will suffice. The problem properties corresponding to the one white cell require two or more steps of improvement in their solution quality. If neither of our mathematical frameworks fits the underlying optimization problem, then the designer still has the option to test whether either the objective function or the constraint set can be restricted to fit one of them, as suggested in Table 2.1.

constraints \ objective function: linear | submodular | unrestricted
matroid: optimal (utility function) | 1/2-approximation (goal) | no guarantee (action)
k-extendible: 1/k-approximation (goal) | 1/(k+1)-approximation (goal) | no guarantee (action)
unrestricted: no guarantee (action) | no guarantee (action) | no guarantee (action)

Table 2.1: Mapping of objective functions and constraint classes to solution qualities/policy types for the greedy strategy.

Chapter 3 discusses how to test the objective function for linearity or submodu-larity, using our objective based framework. Chapter 4 discusses how to test whether the constraints satisfy the matroid or k-extendibility properties, using our constraint based framework.

2.3

The Greedy Technique

In this section, we describe what we mean by the greedy technique in a general setting. In the remainder of the thesis, we describe several computational problems and a greedy algorithm for each of those problems. These algorithms are instantiations of the generic greedy algorithm given below. Our description of the algorithm is as given in [8].

Suppose we are given a set system (U,F) where U is a finite set and F ⊆ 2^U is a family of subsets of U. We are also given an objective function g : 2^U → IR+ that associates a positive real value with every subset of U. The goal is to output a subset F ∈ F that has the maximum value according to g. A natural technique to solve this problem is the greedy method, in which we start with the empty set S and then add elements of U to S in an iterative fashion, making the best choice possible at every step. To ensure that the output is in F, we also establish that this property is true at every intermediate step.

Generic Greedy Algorithm [8]
    S ← ∅
    A ← ∅
    repeat
        A ← {u ∈ U | S ∪ {u} ∈ F}
        if A ≠ ∅
            u ← argmax_{u' ∈ A} g(S ∪ {u'})
            S ← S ∪ {u}
    until A = ∅
    Output S

Here, argmax_{u' ∈ A} g(S ∪ {u'}) is a value of u' for which g(S ∪ {u'}) attains its largest value.
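The generic greedy algorithm can be sketched directly in Python. In the sketch below, representing F explicitly as a set of frozensets is an illustrative assumption that is only practical for small ground sets.

```python
# A minimal sketch of the generic greedy algorithm, assuming (for
# illustration only) that F is given explicitly as a set of frozensets.
def generic_greedy(U, F, g):
    """Grow S one element at a time, always adding the element that keeps
    S in F and maximizes the objective g."""
    S = set()
    while True:
        A = [u for u in U if u not in S and frozenset(S | {u}) in F]
        if not A:                          # no feasible extension remains
            return S
        u = max(A, key=lambda x: g(S | {x}))
        S.add(u)

# a small matroid with the linear objective g(S) = sum of the elements of S
F = {frozenset(s) for s in [(), (1,), (2,), (3,), (1, 2), (2, 3)]}
S = generic_greedy({1, 2, 3}, F, sum)
```

Because this objective is linear and the set system is a matroid, the greedy output {2, 3} is an optimal member of F.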

How can we prove that the greedy algorithm outputs a high quality solution? It turns out that if the set system (U,F) satisfies a “nice” structural property such as a matroid or a k-extendible system and the objective function g is from a “nice” class of functions such as linear or submodular, then it is possible to give a performance guarantee for the greedy algorithm. We discuss these properties in detail in the chapters to follow.

We now describe a simple example of the greedy algorithm for the problem of computing Maximum Spanning Trees in weighted, undirected graphs. The input to the MST problem is a weighted, undirected graph G = (V, E) with a positive weight assigned to every edge. V denotes the set of vertices of G and E denotes the set of edges of G. We will also assume that G is connected. That is, there is a path in G between every pair of vertices.

A spanning tree T = (V, E'), E' ⊆ E, is a subgraph of G that is connected and has no cycles. The weight of any spanning tree is the sum of the weights of the edges in it. Note that G may contain many such spanning trees. The goal in the Maximum Spanning Tree problem is to output a spanning tree of maximum weight.

Kruskal gave the following greedy algorithm for this problem: Sort the edges by decreasing order of weight. Start with an empty set S. Consider each edge e in order and add e to S unless this creates a cycle in S. Return S.

In this example, U is the set of edges E (with weights) and F is the collection of all subgraphs G' of G that do not contain a cycle. Such subgraphs are often referred to as forests. The objective function g, for any subset of edges of G, assigns a weight that is the sum of the weights of all the edges in it. With this correspondence, it is easy to see that Kruskal's algorithm is an instantiation of the generic greedy algorithm.
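Kruskal's greedy algorithm for the Maximum Spanning Tree problem can be sketched as follows. The union-find cycle test and the small example graph are illustrative assumptions, not part of the original description.

```python
# A sketch of Kruskal's greedy algorithm for Maximum Spanning Trees.
# Cycle detection uses a simple union-find over the vertex set.
def max_spanning_tree(n, edges):
    """edges: list of (weight, u, v) with vertices 0..n-1.
    Returns (total weight, list of chosen edges)."""
    parent = list(range(n))

    def find(x):                       # root of x's component
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    total, tree = 0, []
    for w, u, v in sorted(edges, reverse=True):   # decreasing weight
        ru, rv = find(u), find(v)
        if ru != rv:                   # adding (u, v) creates no cycle
            parent[ru] = rv
            total += w
            tree.append((u, v))
    return total, tree

# triangle with edge weights 3, 2, 1: the two heaviest edges are kept
total, tree = max_spanning_tree(3, [(3, 0, 1), (2, 1, 2), (1, 0, 2)])
```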

2.4 Summary

In this chapter, we described autonomic systems and some of their desirable properties. Then, we focused on self-optimizing systems and described the high-level vision of our approach using a handbook for policy-driven optimization strategies. Finally, we gave a generic description of the greedy technique that is central to this thesis.


Chapter 3

Objective Function Based Framework

This chapter introduces properties for objective functions (i.e., linearity and submodularity) and constraint sets (i.e., the matroid property). We assume that the constraint set of an optimization problem is a matroid and show how to vary or constrain an unrestricted objective function by adding mathematical structure to it, so that it satisfies the submodularity or linearity properties. We use two examples to show that if the constraint set is a matroid, then, as the structure of the objective function is changed from unrestricted to submodular and then to linear, the quality of the solution obtained using the greedy technique satisfies the specifications ranging from action to goal and to utility-function policies.

3.1 Resource Allocation in Distributed Systems

The following resource allocation problem arises naturally in many settings. It is the task of allocating heterogeneous resources to servers with the goal of maximizing the system throughput. Applications of this problem include allocations of file servers to workstations in a local network, load balancing in distributed systems, and session allocation in time-shared systems.

Problem 1. (Resource Allocation in Distributed Systems) We are given

• a set V = {1, 2, . . . , m} of m servers, and

• a set R = {1, 2, . . . , l} of l resources (e.g., CPU time, memory, or bandwidth) that are to be assigned to these servers.

Here, we assume that

• there are k resource types such as CPU time, memory, or bandwidth in total. Every such resource type is split into many blocks of fixed size so that one or more such blocks can be assigned to each server. In other words, there are k sets, Rt_1, . . . , Rt_k, one for each resource type, and R = Rt_1 ∪ · · · ∪ Rt_k.

The throughput of a server i, 1 ≤ i ≤ m, denoted by Ti, is a function Ti : 2^R → IR+. The goal is to maximize the sum of the throughputs of the servers, Σ_{i=1}^{m} Ti, subject to the constraint that every resource is assigned to at most one server.

A utility function policy for this problem produces an optimal allocation that maximizes the sum of the throughputs. A goal policy produces an allocation that compares favourably in quality to the optimal allocation, while an action policy recommends best choice actions based on some local criterion.

A greedy algorithm for Problem 1 is as follows:

1. Consider all resources from R that have not yet been assigned to a server.

2. Among those, choose a (resource, server)-pair such that the resulting allocation yields the largest increase in the total throughput.

3. Repeat until each resource is assigned to a server.
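Under a linear objective, these steps can be sketched as follows. The throughput table t and the toy instance are hypothetical, chosen only to illustrate the greedy choice.

```python
# A sketch of the greedy algorithm for Problem 1 with a linear objective:
# t[(r, s)] is the throughput gained when resource r is assigned to
# server s. The instance is hypothetical.
def greedy_allocate(resources, servers, t):
    """Repeatedly pick the unassigned (resource, server)-pair with the
    largest throughput gain until every resource is assigned."""
    allocation = {}                    # resource -> server
    while len(allocation) < len(resources):
        r, s = max(((r, s) for r in resources if r not in allocation
                    for s in servers),
                   key=lambda pair: t[pair])
        allocation[r] = s
    return allocation

t = {(r, s): 10 * r + s for r in (1, 2) for s in (1, 2)}   # toy throughputs
alloc = greedy_allocate((1, 2), (1, 2), t)
```

For a linear objective, the gain of assigning r to s is just t[(r, s)], so each greedy step is a simple maximum over the remaining pairs.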

To analyze the quality of the solution produced by this greedy algorithm, we introduce the mathematical construct of a matroid. Matroids enable us to characterize the properties of all possible allocations of resources to servers satisfying the given constraint.

Matroids are combinatorial structures that are defined to capture the notion of independence in a general setting [12].

Definition 1 (Matroid [30]). Let U be a finite set and F ⊆ 2^U be a collection of subsets of U. A set system (U,F) is called a matroid if it satisfies the following conditions:

1. F satisfies the downward-closure property: If A ⊆ B and B ∈ F, then A ∈ F. That is, any subset of a member of the collection F is also a member of F.

Figure 3.1: Set system (U,F) with F = {∅, {1}, {2}, {3}, {1, 2}, {2, 3}} forms a matroid as it satisfies the downward-closure and the augmentation properties.

2. F satisfies the augmentation property: for all A, B ∈ F with |B| > |A|, there exists an element x ∈ B − A such that A ∪ {x} ∈ F. In other words, for any choice of two sets A and B that are elements of F such that the size of B is larger than the size of A, it is possible to move an element x from B to A such that A ∪ {x} is also in F.

Figure 3.2: Set system (U,F) with F = {∅, {1}, {2}, {1, 2}, {2, 3}} is not a matroid as it does not satisfy the downward-closure property: B = {2, 3} ∈ F but A = {3} ∉ F.

We use the structures depicted in Figures 3.1 and 3.2 to illustrate the verification of the two matroid properties downward-closure and augmentation. In particular, for Figure 3.1:

1. F satisfies the downward-closure property. Every subset of a member of F is also in F.

2. F satisfies the augmentation property. To verify this property, we have to check all possible choices of sets A and B such that |B| > |A|. We illustrate this by one of the choices: A = {1} and B = {2, 3}. We leave the other cases to the reader. In this case, there exists one element x = 2, x ∈ B − A, that can be added to A. The resulting set {1, 2} is in F.

Hence (U,F) in Figure 3.1 is a matroid. In contrast, the set system depicted in Figure 3.2 is not a matroid because F is not downward closed. The set {2, 3} belongs to F but {3} does not.
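For small ground sets, both matroid properties can be checked exhaustively. The sketch below, applied to the collections of Figures 3.1 and 3.2, reproduces the conclusions above.

```python
# Brute-force check of Definition 1 (downward closure + augmentation),
# practical only for small ground sets.
from itertools import combinations

def is_matroid(U, members):
    F = {frozenset(s) for s in members}
    # 1. downward closure: every subset of a member is also a member
    for B in F:
        for r in range(len(B) + 1):
            if any(frozenset(A) not in F for A in combinations(B, r)):
                return False
    # 2. augmentation: |B| > |A| implies some x in B - A with A ∪ {x} in F
    for A in F:
        for B in F:
            if len(B) > len(A) and not any(A | {x} in F for x in B - A):
                return False
    return True

F1 = [(), (1,), (2,), (3,), (1, 2), (2, 3)]   # Figure 3.1: a matroid
F2 = [(), (1,), (2,), (1, 2), (2, 3)]         # Figure 3.2: not a matroid
```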

We now show that for Problem 1, the set of all feasible allocations of resources to servers satisfying the given constraint forms a matroid.

Verification for Problem 1. To view the allocation of resources to servers as a set system (U,F), we define the underlying universe set U = {1, 2, . . . , l} × {1, 2, . . . , m} where any element of set U is a pair (i, j) with 1 ≤ i ≤ l and 1 ≤ j ≤ m. We interpret the choice of such an element as “resource i is assigned to server j”. Then, the set of all possible such allocations F, satisfying the constraint that each resource is assigned to at most one server, is clearly a collection of subsets of U . We show that the set system (U,F) forms a matroid.

• Downward Closure: Let A be any allocation of at most l resources to m servers in F. If A satisfies the given constraint, then any “sub-allocation” B, B ⊆ A, also satisfies the constraint and is therefore a valid allocation of the resources to the servers.

• Augmentation: Consider two allocations A and B of resources to the m servers in F such that |B| > |A|. Then, since B contains more elements than A, there exists x = (r, s), x ∈ B − A, such that resource r does not belong to a tuple in A. Then, we can add tuple (r, s) to set A to get a valid allocation. That is, A ∪ {(r, s)} ∈ F.

Having established that the set of all feasible allocations forms a matroid, we now focus on the nature of the objective function. We show that, as we introduce additional structure to the objective function, the quality of the solution produced by the greedy algorithm improves. In general, an objective function is a function g : 2^U → IR+ (see p. 16). For example, in the resource allocation problem, the objective function T = Σ_{i=1}^{m} Ti takes any allocation and outputs the total throughput (a positive real value).

We use the concepts of submodular and linear functions to characterize different classes of objective functions. We call an objective function submodular if it satisfies the following property.

Definition 2 (Submodular Function). For a given set U, a function g : 2^U → IR+ is called submodular if g satisfies the following property: g(A ∪ B) + g(A ∩ B) ≤ g(A) + g(B) for all A, B ⊆ U.

This property, satisfied by submodular functions, is also referred to as the property of diminishing returns [13, 38].

Example of a submodular function. Let U = {1, 2, 3}. Then the objective function g defined on 2^U by g(∅) = 0, g({1}) = 1, g({2}) = 3, g({3}) = 2, g({1, 2}) = 3, g({2, 3}) = 3, g({1, 3}) = 3, g({1, 2, 3}) = 3 is submodular. To show this, one can verify that the property of diminishing returns holds, as the inequality is true for all choices of A and B. We do it for one choice: A = {1} and B = {2}. g(A ∪ B) + g(A ∩ B) = 3 + 0 = 3. g(A) + g(B) = 1 + 3 = 4. Hence the property holds.
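The remaining cases left to the reader can be checked mechanically by enumerating all pairs A, B ⊆ U. The sketch below tabulates the example g and tests Definition 2 exhaustively.

```python
# Brute-force check of Definition 2 on the example function g above.
from itertools import combinations

def subsets(U):
    return [frozenset(c) for r in range(len(U) + 1)
            for c in combinations(U, r)]

def is_submodular(U, g):
    return all(g(A | B) + g(A & B) <= g(A) + g(B)
               for A in subsets(U) for B in subsets(U))

# the example g from the text, tabulated on 2^U
g_table = {frozenset(): 0, frozenset({1}): 1, frozenset({2}): 3,
           frozenset({3}): 2, frozenset({1, 2}): 3, frozenset({2, 3}): 3,
           frozenset({1, 3}): 3, frozenset({1, 2, 3}): 3}
ok = is_submodular({1, 2, 3}, g_table.__getitem__)
```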

Linear objective functions are another common and important class of functions for optimization problems. We call an objective function linear if it satisfies the following property.

Definition 3 (Linear Function). For a given set U, a function W : 2^U → IR+ is called linear if, for any F ⊆ U, W(F) = Σ_{s∈F} w(s) for some fixed underlying weight function w : U → IR+.

(35)

Any linear function is also a submodular function. To prove this, we must show that a linear function satisfies the inequality in Definition 2.

LHS = W(A ∪ B) + W(A ∩ B)
    = Σ_{s∈A∪B} w(s) + Σ_{s∈A∩B} w(s)
    = (Σ_{s∈A−B} w(s) + Σ_{s∈A∩B} w(s)) + (Σ_{s∈B−A} w(s) + Σ_{s∈A∩B} w(s))
    = Σ_{s∈A} w(s) + Σ_{s∈B} w(s)
    = W(A) + W(B) = RHS

Since the inequality holds with equality, every linear function is submodular.

Example of a linear function. Let U = {1, 2, 3}. Let w be a weight function defined on U by w(1) = 0, w(2) = 1 and w(3) = 2. Further, let the function W be defined on 2^U by W(∅) = 0, W({1}) = 0, W({2}) = 1, W({3}) = 2, W({1, 2}) = 1, W({2, 3}) = 3, W({1, 3}) = 2 and W({1, 2, 3}) = 3. Then, W is linear because, for any subset of U, W is the sum of the weights of all the elements of the subset.
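The example W can be rebuilt from w and checked mechanically. As the derivation above shows, a linear function satisfies the submodular inequality with equality.

```python
# The linear example above: W is the sum of the weights w, so
# W(A ∪ B) + W(A ∩ B) = W(A) + W(B) holds with equality.
from itertools import combinations

w = {1: 0, 2: 1, 3: 2}

def W(F):
    return sum(w[s] for s in F)        # linear by construction

all_subsets = [frozenset(c) for r in range(len(w) + 1)
               for c in combinations(w, r)]
modular = all(W(A | B) + W(A & B) == W(A) + W(B)
              for A in all_subsets for B in all_subsets)
```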

To formalize the expectations of a goal policy, we introduce the concept of an approximation algorithm. This notion gives us a way to measure the quality of the solution produced by such an algorithm with respect to the optimal solution. In particular, we give here the definition for maximization problems.

Definition 4 (Approximation Algorithm [43]). An algorithm A for a maximization problem P is said to be a ρ-approximation algorithm if, for any instance x of P, the value of the objective function on the output of A, denoted by A(x), is at most a factor ρ away from the value of the objective function for the best possible solution, denoted by OPT(x). That is, A(x)/OPT(x) ≥ ρ.

For example, if on all input instances x, A(x) ≥ OPT(x)/2, we say that A is a 1/2-approximation algorithm. Note that many optimization problems encountered in practice are computationally NP-hard [14]. For such problems, it is highly unlikely that we will be able to design an algorithm running in polynomial time that produces an optimal solution, and approximation algorithms are highly desirable.

Having introduced the concepts of approximation algorithm as well as submodular and linear objective functions, we can now state our three results for Problem 1 concisely.

1. Unrestricted objective function: If the objective function of Problem 1 is unrestricted, then there are no theoretical guarantees on the performance of the greedy algorithm. For example, let Tm be defined as Tm(A) = |Am|^2, where A, A ⊆ U, is an allocation and Am ⊆ A contains all the tuples in A assigned to server m. Then, the greedy algorithm may or may not give good quality solutions. Hence, the greedy algorithm in this case can only satisfy the requirements of action policies.

2. Submodular objective function: If we restrict the objective function to be submodular, then the greedy algorithm gives a 1/2-approximation algorithm, using a theorem shown by Fisher, Nemhauser and Wolsey [13]. More formally:

Theorem 1 (Fisher, Nemhauser and Wolsey [13]). Let (U,F) be a matroid and let g : 2^U → IR+ be a monotone, submodular function. Then the greedy algorithm yields a 1/2-approximation.

In other words, the solution produced by the greedy algorithm is guaranteed to have a throughput that is at least half as good as the throughput of the optimal solution. Hence, it satisfies the requirements of goal policies.

To give an example of a submodular function, let t(r, s), for t : U → IR+, give the throughput obtained when resource r is assigned to server s. For any server i and A ⊆ U, let Ti(A) be defined as min(Σ_{a∈Ai} t(a), Bi), where Bi > 0 is a threshold value. It can be shown that the Ti's (and hence T) are submodular.

3. Linear objective function: If we further restrict the objective function to be linear, then the greedy algorithm produces an optimal solution [12]. Formally,

Theorem 2 (Edmonds [12]). Let (U,F) be a matroid and w : U → IR+ be a positive weight function on the elements of U. Then the greedy algorithm returns an optimal solution to the optimization problem max_{F∈F} W(F), where W(F) = Σ_{s∈F} w(s) for any F ∈ F.

Thus, we conclude that the greedy algorithm satisfies the requirements of utility function policies. As an example of a linear function, let t(r, s) give the throughput obtained when resource r is assigned to server s. For any server i and A ⊆ U, let Ti(A) be defined as Σ_{a∈Ai} t(a). Then, the Ti's (and hence T) are linear.

To illustrate the main results from our objective function based framework further, Section 3.2 discusses another resource management problem.


3.2 Resource Allocation for QoS Management

Quality of service (QoS) issues arise in several areas, including communication networks, distributed systems, service-oriented systems, and real-time systems. All these involve strategies to allocate sufficient amounts of resources to the various applications that are running concurrently, in order to satisfy various QoS requirements. Typical QoS parameters include quality, reliability, security, or timeliness. Rajkumar et al. studied the following QoS resource-allocation problem [32, 33].

Problem 2. (Resource Allocation for QoS Management) We are given

• a set of applications {A1, A2, . . . , An},

• a set of minimum resource units required, {Rmin_1, Rmin_2, . . . , Rmin_n}, for QoS purposes, and

• a total number R of available resource units with R ≥ Σ_{i=1}^{n} Rmin_i.

Furthermore, we assume that

• each application Ai has an associated weight wi specifying its relative importance.

• for each application Ai, a utility function Ui is given that depends on the resource allocated to that application.

• every application Ai must be given at least its minimal resource requirement Rmin_i.

The goal is to divide the given resource R among the n applications into {R1, R2, . . . , Rn} so that the total utility U = Σ_{i=1}^{n} wi Ui(Ri) is maximized.

A greedy algorithm for this problem is as follows:

1. Allocate the minimum resource Rmin_i to application Ai for 1 ≤ i ≤ n.

2. Assign one unit of additional resource to an application so that the resulting allocation has the largest increase of U.

3. Repeat step (2) until E = R − Σ_{i=1}^{n} Rmin_i units of excess resource are allocated.
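These steps can be sketched as follows. The utility functions below are hypothetical, and for brevity the weights wi are assumed to be folded into the utilities.

```python
# A sketch of the greedy algorithm for Problem 2. The utility functions
# are hypothetical; each takes the units allocated to that application
# and returns its (weighted) utility.
import math

def greedy_qos(R, r_min, utility):
    alloc = list(r_min)                       # step 1: minimum allocations
    for _ in range(R - sum(r_min)):           # steps 2-3: E excess units
        # give the next unit to the application with the largest
        # marginal increase in total utility U
        i = max(range(len(alloc)),
                key=lambda j: utility[j](alloc[j] + 1) - utility[j](alloc[j]))
        alloc[i] += 1
    return alloc

utils = [lambda x: 10 * math.sqrt(x),         # diminishing returns
         lambda x: 3 * x]                     # constant marginal utility
alloc = greedy_qos(10, [1, 1], utils)
```

In this toy run, the concave application receives excess units only while its marginal utility exceeds 3; the rest goes to the linear application.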

We show that the set system for Problem 2 forms a matroid, where F is the set of all feasible allocations of E units of excess resources to n applications.

Verification for Problem 2. To view the allocation of the excess resource to various applications as a set system, we define the underlying universe set U = {1, 2, . . . , E} × {1, 2, . . . , n}. Then, any element of set U is a pair (i, j), 1 ≤ i ≤ E and 1 ≤ j ≤ n. We interpret this choice as “the ith unit of excess resources is allocated to application j”.

Then, the set of all possible such allocations F satisfying the constraints described above is clearly a collection of subsets of U . We now explain why the pair (U,F) forms a matroid.

• Downward closure: Let us choose any feasible allocation A of at most E units of the excess resource to the n applications. If A is feasible, then any “sub-allocation” B, B ⊆ A, must also be a feasible allocation of the resource to the applications.

• Augmentation: Consider two feasible allocations A and B of at most E units of excess resource to the n applications such that |B| > |A|. Then, since the size of B is bigger than that of A, there is a unit resource k that was allocated to an application At in B but not in A. Since resource k was previously not assigned to any application in allocation A, we can add the tuple (k, At) to A and we get a new feasible allocation.

Since the set of constraints forms a matroid, we can state our results for Problem 2 in the objective function framework:

1. Unrestricted objective function: For an unrestricted objective function U in Problem 2, there is no theoretical guarantee for the quality of the solution produced by the greedy technique. Hence, the greedy technique can only satisfy the requirements of action policies.

2. Submodular objective function: If U is submodular, then the greedy algorithm gives a 1/2-approximation by Theorem 1. Therefore, the greedy algorithm satisfies the expectations of goal policies.

3. Linear objective function: Finally, if we further restrict U to be linear, then the greedy algorithm produces an optimal solution by Theorem 2, and hence matches the needs of utility function policies.

Remarks. The results that we have shown have two interesting implications for the work by Rajkumar et al. [32, 33].

• They study a class of objective functions called min-linear-max functions and point out that these functions are very useful and appropriate in many scenarios [33]. Min-linear-max functions are submodular because they are constructed as follows: We start with a weight function w, w : U → IR+. The min-linear-max function gB for any subset S of U is defined as gB(S) = min(Σ_{i∈S} w(i), B) for some B ≥ 0. As previously mentioned, such a function gB, for any B ≥ 0, is a submodular function. Therefore, in this special case of min-linear-max functions, the greedy algorithm produces a 1/2-approximation.

• Another important scenario considered by Rajkumar et al. is the case of linear objective functions with many dependent QoS dimensions [32]. In such a scenario, if an application is given a units of Resource 1, it must be given at least b units of Resource 2. For this case, they designed a greedy algorithm and showed, using an example, that it can be sub-optimal. In our results for Problem 2, we show that for the case of a linear objective function coupled with independent QoS dimensions, the greedy algorithm produces an optimal solution. This follows from (3) above, as the objective function is linear and the constraint set forms a matroid when QoS dimensions are independent.

3.3 Summary

In this chapter, we introduced important mathematical properties for objective functions (i.e., linearity and submodularity) and constraint sets (i.e., the matroid property). We also developed the objective function based framework. In this framework, we assumed that the constraint set of an optimization problem is a matroid and showed how to vary or constrain the unrestricted objective function by adding mathematical structure to it to satisfy the submodularity or linearity properties. We used two examples to show that if the constraint set is a matroid, then, as the structure of the objective function is changed from unrestricted to submodular and then to linear, the quality of the solution obtained using the greedy technique satisfies the specifications ranging from action to goal and to utility-function policies.


Chapter 4

Constraint Based Framework

This chapter introduces a new property for constraint sets (i.e., k-extendibility) that generalizes the concept of matroids. We assume that the objective function of an optimization problem is linear. We then show how to vary or constrain the unrestricted constraint set by adding mathematical structure to it to satisfy the k-extendibility or matroid properties. We use two examples to show that if the objective function is linear, and as the structure of the constraint set is changed from unrestricted to k-extendible and then to matroid, the quality of solution obtained using the greedy technique satisfies the specifications ranging from action to goal and to utility-function policies.

4.1 Data Center Based Scheduling Problem

Let us consider a scheduling problem as outlined and studied by Mestre [28].

Problem 3. (Data Center Based Scheduling Problem) Given a set of n jobs J1, . . . , Jn, where each Ji has the following parameters:

• Arrival time: Ai

• Deadline: Di

• Processing time: Pi

• Profit or revenue: Ri

The goal is to schedule jobs on a single server such that the total revenue is maximized. The total revenue of a schedule is the sum of the revenues of the jobs processed in the schedule. We show an example in Figure 4.1.

Figure 4.1: Revenue of a schedule: a schedule S containing jobs J1, J2, and J3 with revenues 4, 5, and 2 has total revenue R = 4 + 5 + 2 = 11.

The greedy algorithm to solve this problem is as follows:

1. Sort all jobs based on revenue Ri in decreasing order.

2. Start with the empty schedule and add a job to the current schedule if feasible. When job Ji is added, choose and fix a start time ti with Ai ≤ ti ≤ Di − Pi.
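For the equal-processing-time case discussed later in this section, the greedy schedule can be sketched as follows. The job list and the slot-based feasibility test are illustrative assumptions.

```python
# A sketch of the greedy algorithm for Problem 3 when all jobs share the
# same processing time P. Jobs are hypothetical (A_i, D_i, R_i) triples;
# time is discretized into unit slots.
def greedy_schedule(jobs, P):
    """Consider jobs in decreasing order of revenue; give each job the
    earliest feasible start time, if any. Returns [(job index, start)]."""
    order = sorted(range(len(jobs)), key=lambda i: -jobs[i][2])
    busy, schedule = set(), []
    for i in order:
        A, D, R = jobs[i]
        for t in range(A, D - P + 1):         # A_i <= t <= D_i - P
            slots = set(range(t, t + P))
            if not slots & busy:              # server free on [t, t+P)
                busy |= slots
                schedule.append((i, t))
                break
    return schedule

jobs = [(0, 2, 4), (0, 2, 5), (1, 3, 2)]      # (A_i, D_i, R_i)
sched = greedy_schedule(jobs, P=1)
```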

Verification for Problem 3. The objective function, the total revenue R of a schedule S, is defined as R = Σ_{Ji∈S} Ri. This objective function is linear.

Thus, we focus on the constraint set. Let D = max_i Di denote the deadline by which the schedule S completes all the jobs chosen. To view the allocation of jobs to the server as a set system, we define the underlying universe set U = {1, 2, . . . , n} × {1, 2, . . . , D}. Then, any element of U is a pair (i, j) and we interpret this as “Job i will be processed starting from time instant j”. F, the set of all feasible schedules, is a collection of subsets of U. For each pair (i, j) in a feasible schedule, j satisfies the inequality Ai ≤ j ≤ Di − Pi, where Ai is the arrival time of Job Ji, j is the time when Job Ji starts processing, and Di − Pi is the time by which Job Ji must start to meet the deadline Di.

We now introduce the concept of a k-extendible system. We use this concept to provide performance guarantees of a greedy algorithm towards meeting the specifications of goal policies in our constraint based framework. Mestre introduced the concept of a k-extendible system in his study of the performance of the greedy technique as an approximation algorithm [28]. It is useful for understanding the structure within a set of constraints when it is “close” to being a matroid.

Definition 5 (k-Extendible System [28]). Let U be a finite set and F, F ⊆ 2^U, be a collection of subsets of U. Set system (U,F) is called a k-extendible system if it satisfies the following properties:

1. Downward-closure: If A ⊆ B and B ∈ F, then A ∈ F.

2. Exchange: Let A, B ∈ F with A ⊆ B, and let x ∈ U − B be such that A ∪ {x} ∈ F. Then there exists Y ⊆ B − A, |Y| ≤ k, such that (B − Y) ∪ {x} ∈ F. In other words, let us start with any choice of two sets A and B such that B is an extension of A. Suppose that there is an element x such that the set A with x added to it also belongs to F. Then we will be able to find a subset Y inside B of size at most k such that if we remove the elements of Y from B and add the element x to the resulting set, it will also belong to the collection F.

Figure 4.2: Set system (U,F) with F = {∅, {1}, {2}, {3}, {1, 2}} is 2-extendible.

Figure 4.2 shows an example of a 2-extendible system. We will show that F satisfies the downward-closure and the exchange properties.

1. Downward-closure: As in the matroid example (cf. Fig. 3.1), the reader can verify that F is downward-closed.

2. Exchange: Suppose A = ∅, B = {1, 2} and x = 3. The conditions A ⊆ B and A ∪ {x} ∈ F are satisfied. We set Y = {1, 2}; then (B − Y) ∪ {x} = {3} ∈ F. Note that the selected Y is the smallest possible choice: removing only one element of B leaves a set that, with x added, is not in F.

Therefore, F is a 2-extendible system.
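The exchange check can be mechanized for small systems. Applied to the collection of Figure 4.2, the sketch below confirms that it is 2-extendible but not 1-extendible (and hence not a matroid).

```python
# Brute-force test of Definition 5 for small set systems.
from itertools import combinations

def subsets_of(S):
    return [frozenset(c) for r in range(len(S) + 1)
            for c in combinations(S, r)]

def is_k_extendible(U, members, k):
    F = {frozenset(s) for s in members}
    # downward closure
    if any(A not in F for B in F for A in subsets_of(B)):
        return False
    # exchange: for A ⊆ B in F and x with A ∪ {x} in F, some Y ⊆ B − A,
    # |Y| ≤ k, has (B − Y) ∪ {x} in F
    for B in F:
        for A in subsets_of(B):
            for x in set(U) - B:
                if A | {x} in F and not any(
                        (B - Y) | {x} in F
                        for Y in subsets_of(B - A) if len(Y) <= k):
                    return False
    return True

F = [(), (1,), (2,), (3,), (1, 2)]            # Figure 4.2
```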

To remark upon the connection between matroids and k-extendible systems, one can show that matroids are equivalent to 1-extendible systems. In this sense, k-extendible systems for k ≥ 2 are generalizations of matroids.


A special case for Problem 3. Mestre showed that for the job scheduling problem introduced as Problem 3 here, if all the processing times Pi are equal, then the set of feasible schedules F forms a 2-extendible system [28]. The main idea of the proof is illustrated as follows. Adding a new job Ji at time t to a schedule B that is an extension of a schedule A can create a conflict with other jobs already scheduled in B. However, since all the jobs have the same processing time, there cannot be more than two jobs in conflict with job Ji. If these two jobs are removed from B and the new job Ji is added to B with start time t, then it is a feasible schedule.

Figure 4.3: Unrestricted Processing Times

Having introduced the concept of a k-extendible system, we are now ready to state our observations for the Data Center Scheduling problem (Problem 3) in the constraint based framework.

1. Unrestricted constraint set: Suppose we do not place any conditions on the four parameters Ai, Di, Pi and Ri. Such an example is shown in Figure 4.3. Then, the set of all feasible schedules is not a k-extendible system for any k, as it violates the exchange property. If we start with a feasible schedule S of jobs and add a new job J, it may not always be possible to bound the number of jobs that need to be removed from S. We may need to remove many jobs with small processing times to be able to include a new job J with a large processing time. Hence, our framework gives no theoretical guarantees for the performance of the greedy algorithm. Therefore, an unrestricted constraint set satisfies the expectations of action policies.

Figure 4.4: Equal Processing Times

2. k-Extendible constraint set: As explained above in the special case for Problem 3, if all processing times Pi are equal, then the set of all feasible schedules forms a 2-extendible system.

Mestre showed that for any optimization problem in which the objective function is a linear function and the constraints form a k-extendible system, the greedy algorithm gives a 1/k-approximation [28]. More formally,

Theorem 3 (Mestre [28]). Let (U,F) be a k-extendible system for some k. Let w : U → IR+ be a positive weight function on U. Then, the greedy algorithm gives a 1/k-approximation algorithm for the optimization problem that asks to determine max_{F∈F} W(F), where W(F) = Σ_{s∈F} w(s) for any F ∈ F.

Applying this result to our problem, the greedy technique provides a 1/2-approximation when all processing times are equal, because the set of feasible schedules forms a 2-extendible system in that case. Therefore, it satisfies the specifications of goal policies.

3. Matroid constraint set: If we further assume unit processing times as shown in Figure 4.4, that is, Pi = 1 for all i, then the set of all feasible schedules forms a matroid, as shown by Mestre [28]. The greedy algorithm produces an optimal schedule in this scenario, using the result of Edmonds [12]. Therefore, the quality of the solution matches the requirements of utility function policies.

The scheduling problem studied above is closely related to the Data Center problem studied by Kephart and Walsh [23]. Here, each job Ji is given a release time ti. The jobs come from two classes, gold and silver: gold jobs have a higher priority than silver jobs. Characterizing jobs as gold and silver jobs is useful for service level agreements (SLAs). Job Ji is expected to be serviced within response time ri. Furthermore, let us assume that Ji has a processing time Pi, and that the deadline for job Ji is ti + ri. Then, if a job is serviced within its response time, it is also processed before its deadline and vice versa. Thus, we are able to relate our results for the scheduling problem above to Kephart and Walsh's Data Center problem.

4.2 SLA Based Profit Optimization

Another example that illustrates our results for the constraint based framework is a variant of the SLA based profit optimization problem in autonomic computing systems studied by Zhang and Ardagna [46]. A similar SLA based optimization problem was also studied recently by Zhang et al. [47].

To model this profit optimization problem, the authors view a data center as a distributed system with M clusters, where each cluster consists of many servers. Further, there are K different classes of requests. The goal of the scheduler is to assign incoming requests to servers. Each request class has an associated function that gives the revenue (or penalty) gained based on the average response time. This is a part of the service level agreement (SLA). In their paper, one of the constraints requires that at each server, each job class is assigned to exactly one service level. We modify this constraint such that each request class is assigned to at most one service level. Each SLA level can be viewed as assigning a priority to a request class at each server. If no SLA level is allocated to a request class, we assume that the allocator will use a default option for this class.

We now consider the scenario studied in [46]: both the number of servers that are switched on and the load at each server are fixed. Then the optimization problem becomes the following multi-choice binary knapsack problem:

Problem 4. (SLA Based Profit Optimization) We are given

• a knapsack with capacity W,

• n groups of items where group i has ki items,

• item j of group i is assigned a value vij, and

• item j of group i requires resources that are represented by its weight wij.

The objective is to pick at most one item from each group such that the total weight of the knapsack does not exceed W and the total value of the collected items is maximized. Note that the objective function is linear, as the total value is the sum of the values of the collected items. A greedy algorithm for this problem is as follows:

1. Sort items based on their values vij.

2. Starting with an empty knapsack, add the next item from the sorted list into the knapsack provided the weight constraint of the knapsack is satisfied and no item has been picked from the same group before.

3. Repeat this process until we reach the end of the sorted list or the weight of the knapsack exceeds W .
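The steps above can be sketched in Python. This is our own sketch, not code from [46]; one natural reading of step 3 is to skip any item that does not fit and continue down the sorted list, which is what this version does.

```python
def greedy_multichoice_knapsack(groups, W):
    """Greedy for the multi-choice binary knapsack (steps 1-3 above).

    groups: list of lists, where groups[i][j] = (value, weight) of
    item j in group i.  Picks at most one item per group without
    exceeding the capacity W, scanning items in non-increasing value
    order and skipping items that do not fit.
    Returns (total_value, chosen) with chosen = [(group, value, weight)].
    """
    items = [(v, w, g) for g, grp in enumerate(groups) for (v, w) in grp]
    items.sort(key=lambda t: t[0], reverse=True)  # step 1: sort by value
    chosen, used_groups, total_w, total_v = [], set(), 0, 0
    for v, w, g in items:  # step 2: add if feasible
        if g not in used_groups and total_w + w <= W:
            chosen.append((g, v, w))
            used_groups.add(g)
            total_w += w
            total_v += v
    return total_v, chosen  # step 3: stop at end of list
```

For instance, with two groups [(10, 5)] and [(8, 4), (3, 2)] and W = 7, greedy takes the value-10 item, cannot fit the value-8 item, and then takes the value-3 item from the second group, for a total value of 13.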

Our main results for the SLA Based Profit Optimization (Problem 4) in the constraint based framework are:

• Unrestricted constraint set: For the unrestricted case with no conditions on the resource requirements (that is, weights) of individual items, the exchange property is not satisfied for any fixed choice of k. If we consider a feasible collection of items in the knapsack to which we want to add a new item, it may not be possible to bound the number of items to be removed before the new item can be added. Hence the greedy algorithm can only satisfy the expectation of action policies.


• k-Extendible constraint set: In many scenarios, the weights of the items satisfy additional conditions. For example, suppose that the ratio of the maximum and minimum weights is bounded by k, that is, wmax/wmin ≤ k. Then the set of all possible collections forms a k-extendible system, and the solution produced by the greedy algorithm satisfies the requirements of goal policies.
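The ratio condition can be checked directly from the item weights. The helper below is ours (not from the thesis): it returns the smallest integer k for which wmax/wmin ≤ k holds, and hence the 1/k greedy guarantee implied by Theorem 3.

```python
import math

def extendibility_bound(weights):
    """Smallest integer k with max(weights) / min(weights) <= k.

    Under this condition the feasible collections form a k-extendible
    system, so the greedy solution is within a factor 1/k of optimal.
    Assumes all weights are positive.
    """
    return math.ceil(max(weights) / min(weights))
```

For example, item weights {2, 3, 6} give k = 3, so greedy is a 1/3-approximation, while equal weights give k = 1, the matroid-like best case for this bound.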

Zhang and Ardagna designed a solution based on tabu search to solve the optimization problem [46]. Interestingly, the initial high-quality solution needed to start tabu search is obtained using the greedy approach, and is then further improved upon. However, under more structured conditions, we can guarantee that the greedy solution already performs well and provides theoretical guarantees for the quality of the solution.

4.3 Summary

This chapter introduced a new property for constraint sets (i.e., k-extendibility) that generalizes the concept of matroids. We developed the constraint based framework. In this framework, we assumed that the objective function of an optimization problem is linear and showed how to vary or constrain the unrestricted constraint set by adding mathematical structure to it to satisfy k-extendibility or matroid properties. We used two examples to show that if the objective function is linear, then as the structure of the constraint set is changed from unrestricted to k-extendible and then to matroid, the quality of the solution obtained using the greedy technique satisfies specifications ranging from action to goal to utility-function policies.


Chapter 5

Case Studies: An Evaluation and Analysis

In the previous two chapters, we introduced two frameworks and showed how they could be applied to study problems that arise naturally in practice. In this chapter, we demonstrate the usefulness of our approach by applying the ideas developed for the objective function based framework and the constraint based framework to two problems arising in practice. The first problem, recently studied by Agrawal et al., aims to find a deployment strategy of jobs to clouds under a set of constraints with the goal of maximizing the total revenue [1]. The second problem, studied by Tantawi et al., is relevant to service providers who try to find allocation mechanisms that maximize the total throughput of a clustered web farm [42].

5.1 Scheduling on a Distributed Set of Clouds

With cloud computing’s ability to provision resources on-demand, the ability to provision clouds for running a set of jobs efficiently becomes important. This is particularly important in a typical situation where different clouds charge different costs to use
