Mobility management and mobile server dispatching in fixed-to-mobile and mobile-to-mobile edge computing

by

Jingrong Wang

B.Sc., Beijing Jiaotong University, 2017

A Thesis Submitted in Partial Fulfillment of the Requirements for the Degree of

MASTER OF SCIENCE

in the Department of Computer Science

© Jingrong Wang, 2019, University of Victoria

All rights reserved. This thesis may not be reproduced in whole or in part, by photocopying or other means, without the permission of the author.


Mobility Management and Mobile Server Dispatching in Fixed-to-mobile and Mobile-to-mobile Edge Computing

by

Jingrong Wang

B.Sc., Beijing Jiaotong University, 2017

Supervisory Committee

Dr. Jianping Pan, Supervisor (Department of Computer Science)

Dr. George Tzanetakis, Departmental Member (Department of Computer Science)


ABSTRACT

Mobile edge computing (MEC) has been considered as a promising technology to handle computation-intensive and latency-sensitive tasks for mobile user equipments (UEs) in next-generation mobile networks. Mobile UEs can offload these tasks to nearby edge servers, which are typically deployed on base stations (BSs) that are equipped with computation resources. Thus, the task execution latency as well as the energy consumption of mobile devices can be reduced.

Mobility management, which associates UEs with the appropriate BSs, plays a fundamental role in MEC. In the existing handover decision-making process, communication costs dominate. In edge scenarios, however, the computation capacity constraints should also be considered. Due to user mobility, mobile UEs are nonuniformly distributed over time and space: edge servers in hot-spot areas can be overloaded while others are underloaded. When edge servers are densely deployed, each UE may have multiple choices for offloading its tasks; when they are sparsely deployed, UEs may have only one option, which aggravates the unbalanced workload of the deployed edge servers. Therefore, how to serve dynamic hot-spot areas needs to be addressed in both edge server deployment scenarios.

Considering the two scenarios discussed above, two problems are addressed in this thesis: 1) with densely deployed edge servers, for each mobile UE, how to choose the appropriate edge servers independently without full system information is investigated, and 2) with sparsely deployed edge servers, how to serve dynamic hot-spot areas in an efficient and flexible way is emphasized. First, with BSs densely deployed in hot-spot areas, mobile UEs can offload their tasks to one of the available edge servers nearby. However, precise full system information, such as the server workload, is hard to synchronize in real time, and doing so introduces extra signaling overhead for mobility management decision-making. Thus, a user-centric reinforcement-learning-based mobility management scheme is proposed to handle system uncertainties. Each UE observes the task latency and automatically learns the optimal mobility management strategy through trial and feedback. Simulation results show that the proposed scheme manifests superiority in dealing with system uncertainties. When compared with the traditional received signal strength (RSS)-based handover scheme, the proposed scheme reduces the task execution latency by about 30%.


Second, fixed edge servers that are sparsely deployed around mobile UEs are not flexible enough to deal with time-varying task offloading. Dispatching mobile servers is formulated as a variable-sized bin-packing problem with geographic constraints. A novel online unmanned aerial vehicle (UAV)-mounted edge server dispatching scheme is proposed to provide flexible mobile-to-mobile edge computing services. UAVs are dispatched to the appropriate hover locations by identifying the hot-spot areas sequentially. Theoretical analysis is provided with the worst-case performance guarantee. Extensive evaluations driven by real-world mobile requests show that, with a given task finish time, the mobile dispatching scheme can serve 59% more users on average when compared with the fixed deployment. In addition, the server utilization reaches 98% during the daytime with intensive task requests. Utilizing both the fixed and mobile edge servers can satisfy even more UE demands with fewer UAVs to be dispatched and a better server utilization.

To sum up, not only the communication condition but also the computation limitation has an impact on edge server selection and mobility management in MEC. Moreover, dispatching mobile edge servers can be an effective and flexible way to supplement the fixed servers and deal with dynamic offloading requests.


Contents

Supervisory Committee
Abstract
Table of Contents
List of Tables
List of Figures
List of Abbreviations
Acknowledgements
Dedication

1 Introduction
1.1 Mobile edge computing
1.1.1 MEC background
1.1.2 MEC scenarios and applications
1.1.3 Research topics in MEC
1.2 Research problems and contributions
1.2.1 Q-learning-based mobility management scheme
1.2.2 Online mobile edge server dispatching scheme
1.3 Thesis Outline

2 Q-learning-based Mobility Management under Uncertainties for Mobile Edge Computing
2.1 Introduction
2.2 Related work
2.2.1 Mobility management
2.2.2 Preliminaries of Q-learning
2.3 System model
2.3.1 User mobility and task generation
2.3.2 Communication model
2.3.3 Computation model
2.4 Mobility management with full information
2.5 Mobility management with partial information
2.6 Performance evaluation
2.6.1 Simulation setup
2.6.2 BS-side performance evaluation
2.6.3 UE-side performance evaluation
2.7 Conclusions

3 Online UAV-mounted Edge Server Dispatching for Mobile-to-Mobile Edge Computing
3.1 Introduction
3.2 Related work
3.2.1 Fixed edge server placement
3.2.2 Mobile edge server trajectory planning
3.3 Motivation
3.3.1 Tencent trace description
3.3.2 Dynamic task requests
3.4 System model
3.4.1 Communication model
3.4.2 Computation model
3.5 Online mobile server dispatching scheme
3.5.1 Problem formulation
3.5.2 Online dispatching scheme
3.5.3 Theoretical analysis
3.6 Performance evaluation
3.6.1 Dynamic task requests and unbalanced workload
3.6.2 Number of served tasks
3.6.3 Number of UAVs dispatched
3.6.5 Utilization of the computation resources
3.6.6 Practical performance of HOLD
3.6.7 Impact of parameters
3.7 Conclusions

4 Conclusions and Future Work

A Selected Publications


List of Tables

Table 2.1 System parameters in simulation
Table 3.1 Definition and notation in Chapter 3


List of Figures

Figure 2.1 RSS from adjacent BSs
Figure 2.2 Agent-environment interaction in Q-learning
Figure 2.3 The architecture of MEC with UEs
Figure 2.4 Transition diagram of task generation
Figure 2.5 Latency model of task offloading
Figure 2.6 Flow chart of the Q-learning-based mobility management scheme
Figure 2.7 Simulation topology
Figure 2.8 Variance of the BS workload
Figure 2.9 Average per-task latency
Figure 2.10 Average per-UE energy consumption
Figure 2.11 Per-UE handover frequency
Figure 2.12 Impact of ε
Figure 2.13 Impact of γ
Figure 2.14 Impact of α
Figure 3.1 Tencent requests distribution on Oct. 1st, 2018 (a national holiday in China) at Happy Valley, Beijing
Figure 3.2 System model of MEC with UAV-mounted edge servers
Figure 3.3 Illustration of Theorem 3
Figure 3.4 The number of associated tasks for each fixed BS
Figure 3.5 Impact on service capacity
Figure 3.6 Impact on the dispatched UAVs
Figure 3.7 Impact on service fairness index in latency
Figure 3.8 Impact on server utilization
Figure 3.9 Practical performance of HOLD
Figure 3.10 Impact of server capacity on service capacity
Figure 3.11 Impact of server capacity on the number of dispatched UAVs
Figure 3.12 Impact of server capacity on service fairness
Figure 3.13 Impact of server capacity on server utilization
Figure 3.14 Impact of ∆r on service capacity


List of Abbreviations

AP Access point

AR Augmented reality

BS Base station

CDF Cumulative distribution function

DVFS Dynamic voltage and frequency scaling

FCFS First-come-first-serve

FLC Fuzzy logic control

FPC Farthest point clustering

LoS Line-of-sight

MCC Mobile cloud computing

MDP Markov decision process

MEC Mobile edge computing

NFV Network functions virtualization

POI Point of interest

QoS Quality of service

RACS Radio applications cloud server

RAT Radio access technology

RSRP Reference signal received power

RSS Received signal strength

SDN Software-defined networking

UAV Unmanned aerial vehicle

UE User equipment

VANET Vehicular ad hoc networks

VBP Variable-sized bin packing


ACKNOWLEDGEMENTS

I would like to express special thanks to my supervisor, Prof. Jianping Pan, for giving me all kinds of opportunities during my Master's program. When I entered UVic, I was determined to learn all of his good points. He taught me how to do research, how to schedule things and how to treat others professionally. I respect him.

I am particularly grateful to my supervisory committee member, Prof. George Tzanetakis. The first course I took at UVic was his data mining course. Thank you for always encouraging me, whether in the course, the directed study or the final thesis.

I admire Prof. Lin Cai for her attitude toward solving research problems and her excellent ability to handle matters. Thanks to Prof. Ming Ling and his wife for always inviting me to dinner. Thanks for the generous help from all my research group members. I wish you all the best.

Thanks to my parents, Tao Wang and Xinglun Chen, for supporting me all the time. They have accompanied me through ups and downs. Thanks to my love, Kaiyang, for not only giving constructive suggestions on my research but also taking good care of me in life. I love you.

Looking back on the past two years, I have been lucky. I hope the hard work deserves the luck.


DEDICATION

To my lovely parents.


Chapter 1

Introduction

1.1 Mobile edge computing

In this section, the research background, history and development of MEC are introduced. Furthermore, application scenarios and current research topics are summarized.

1.1.1 MEC background

Mobile devices have played an essential role in human life, especially smart devices with advanced multimedia and computation capabilities. Users can not only experience convenient communication services but also enjoy various mobile entertainment applications. Benefiting from the rapid development of mobile computing, computation-intensive and delay-sensitive applications have proliferated in recent years, e.g., live video analytics, face recognition and augmented reality (AR) [1]. However, limited by the computation resources and the battery life of mobile devices, the quality of service (QoS) of these applications is hard to guarantee [2].

To tackle the limitations of mobile devices, mobile cloud computing (MCC) has emerged to offload computation-intensive tasks to remote clouds. It allows mobile user equipments (UEs) to use cloud infrastructure such as servers, storage, operating systems and application programs [3]. Mobile UEs first need to access mobile networks through base stations (BSs) or access points (APs). Next, network operators send the requests/data from UEs to remote clouds via the Internet. Then, cloud controllers match the requests with the corresponding application servers. In this way, data can be processed in centralized clouds with much more powerful computation capabilities. According to Cisco, data center traffic will reach 20.6 Zettabytes by 2021 [4]. A large amount of mobile data traffic puts a heavy load on the core network, and the bandwidth of backhaul networks becomes a bottleneck. Moreover, multi-hop communications between mobile UEs and remote clouds are needed. This becomes a critical challenge for latency-sensitive applications.

To cope with the challenge of MCC, mobile edge computing (MEC) has attracted extensive attention by processing data at the edge of the network. Computing capabilities and storage resources are distributed in close proximity to mobile UEs. Generally, the edge can be defined as any devices between mobile UEs and clouds. Specifically, edge servers can be deployed at BSs, APs or multi-radio access technology (RAT) cells. As data transmission between mobile UEs and edge servers remains at the network edge, data traffic in the core network can be reduced significantly. Therefore, MEC overcomes the backhaul communication bottleneck and thus provides low-latency services for mobile UEs [5]. Taking MAUI [6], a well-known system that offloads tasks to a nearby server, as an example, the task latency of face recognition (or video game) reduces from 18 (or 1.5) s to around 1.5 (or 0.3) s, and the task offloading saves the energy consumption of mobile devices by 90% (or 27%).

The development of MEC started from the cloudlet in 2009. Satyanarayanan et al. [7] proposed that real-time computation-intensive tasks can be offloaded to nearby cloudlets through one-hop high-bandwidth wireless access links. A cloudlet is a cluster of virtual machines (VMs) with rich computation capacities and storage resources. Nevertheless, distant clouds and local execution can be the alternative choices if no cloudlets are available in physical proximity. Then, Bonomi et al. [8] from Cisco presented a similar concept, fog computing, in 2012, which is a virtualized platform that provides network access and computation resources located between UEs and remote clouds. In 2013, IBM and Nokia jointly implemented the first MEC platform, the radio applications cloud server (RACS), which is integrated with BSs [9]. In 2015, the European Telecommunications Standards Institute (ETSI) standardized MEC [10]. After that, collective cooperation among telecom companies and universities was launched, e.g., the Open Edge Computing initiative (Intel and Carnegie Mellon University (CMU), etc.) [11], and the OpenFog Consortium (Cisco and Princeton University, etc.) [12].


1.1.2 MEC scenarios and applications

Recent years have witnessed the flourishing of MEC-assisted applications. Based on the characteristics and features of different mobile applications, three main scenarios are summarized as follows: 1) multimedia, which requires high bandwidth for data transmission; 2) IoT and smart city, which consists of massively connected devices; and 3) vehicular ad hoc networks (VANET), where UE mobility counts.

Multimedia

MEC provides low-latency content delivery and efficient data processing services at the edge of the network. The performance of booming 5G multimedia applications thus can be enhanced by MEC, such as AR, live video analytics, image processing, and object recognition [13]. Taking AR as an example, edge servers can conduct the real-time analysis of the UE position, device direction, and camera view with low latency. In addition, MEC can also help improve network performance. Taking video streaming as an example, downlink capacity can be analyzed at edge servers and then sent to the video content server [10]. This assists TCP congestion control to reduce the video-stall occurrences.

IoT and smart city

As a large number of devices are connected through various radio technologies, MEC can help manage different protocols and process the big data generated by these massive devices [10]. On the one hand, raw data collected by IoT devices can be extracted and processed through edge analytics first. Thus, the amount of data transmitted to remote clouds shrinks tremendously. On the other hand, security and privacy preservation can be enforced prior to the data being transmitted to the clouds. Therefore, the widespread geographical distribution of edge servers supports IoT, smart city and big data-related applications in terms of scalability. Applications in this scenario can be categorized into human-oriented services, e.g., healthcare and shopping cart update, and city-oriented services, e.g., smart grid, smart building control, environment monitoring and public safety [14].


VANET

Traffic information needs to be exchanged and distributed in real time, which sets a high requirement in timeliness and reliability. MEC supports commercial roadside functionalities by enhancing the interaction and cooperation between connected vehicles and roadside units. Applications lie in automated driving, hazard warnings, traffic lights control, navigation, and parking services [15, 16]. Taking automated driving as an example, intelligent automobile operation is conducted based on real-time precise road information. Edge servers can analyze the data collected by the onboard sensors with powerful computation capabilities and use the extracted information to guide the steering control and route planning [16]. Meanwhile, the traffic information can be quickly disseminated to nearby vehicles and adjacent edge servers as well as remote clouds for further analysis.

1.1.3 Research topics in MEC

The challenges of the aforementioned applications lead to various research topics in MEC. Four hot topics that have been intensively studied recently, i.e., task offloading, resource allocation, caching and VM migration policies, are introduced as follows.

• Task offloading refers to 1) whether to offload a task to the edge server or not, and 2) what the best offloading plan (full or partial offloading) is. Existing work has concentrated on either static task offloading, where the locations of UEs are fixed during the offloading process [17], or dynamic offloading schemes that consider UE mobility [18, 19]. The main objective of the offloading decisions can be the minimization of task latency, energy consumption or their trade-off.

• Resource allocation refers to how to manage the radio and computation resources efficiently for task offloading. Task offloading consists of uplink data transmission, data processing at the edge servers and downlink data transmission. The transmission precoding matrices and CPU cycles assigned to each UE should be allocated effectively for efficient task execution [20].

• Caching policy refers to which contents need to be cached at which edge servers, considering the various contents requested by mobile UEs or needed for data processing. Content popularity and caching policies need to be addressed to improve system capacity and reduce content delivery latency [21].

• VM migration refers to 1) whether the offloaded tasks need to be migrated among edge servers or not, and 2) how to return the computation results back to the users when they are moving around. VM migration can be initiated based on the experienced task latency and the availability of computation resources [22]. Moreover, VM replicas can also be deployed to maintain a low end-to-end delay.

All these research efforts try to explore the relationship between mobile UEs and edge servers. Mobility management thus becomes a common and fundamental problem underlying the above topics. It addresses which mobile UEs need to be associated with which edge servers and when to hand over. In existing mobility management schemes, communication costs dominate, and the channel quality, such as the received signal strength (RSS), is measured and reported periodically for handover decisions. For low-latency services, how to adjust the handover criteria and deal with the user association problem in MEC is further investigated in this thesis.

1.2 Research problems and contributions

In this thesis, we focus on mobility management in MEC: how mobile UEs determine and hand over to the appropriate edge servers to achieve low-latency services. Generally, mobility management consists of the following six phases: 1) cell search, 2) access control, 3) cell identification, 4) cell selection/reselection, 5) handover decision, and 6) handover execution [23]. Mobile UEs need to identify nearby BSs, measure the corresponding signal quality and make handover decisions based on pre-defined criteria. The current handover decisions are mostly event-triggered and threshold-based, e.g., a handover will be triggered if the RSS from a candidate BS is higher than that from the current BS by more than a threshold.
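As an illustration, the event-triggered, threshold-based criterion described above can be sketched as follows; the function name and the 3 dB hysteresis value are illustrative assumptions, not values from this thesis.

```python
def should_hand_over(rss_current_dbm: float,
                     rss_candidate_dbm: float,
                     hysteresis_db: float = 3.0) -> bool:
    """Trigger a handover when the candidate BS's RSS exceeds the
    current BS's RSS by at least the hysteresis threshold."""
    return rss_candidate_dbm - rss_current_dbm >= hysteresis_db

# A UE near the cell border: the candidate BS is 4 dB stronger.
print(should_hand_over(-110.0, -106.0))  # True
```

The hysteresis prevents ping-pong handovers when the two RSS curves cross repeatedly near the cell border.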

However, different from existing mobility management, which only takes the wireless channel condition into account, handover decisions and the corresponding criteria in MEC need to be redefined by considering both the communication and computation constraints [24]. Mobile UEs may gather at certain places, leaving nearby edge servers overloaded and others underloaded. Meanwhile, edge server deployment also affects the performance of mobility management. When edge servers are densely deployed in hot-spot areas, mobile UEs may have multiple choices to offload their tasks to edge servers. Considering the dynamic network due to UE mobility, full system information, e.g., channel conditions and server workloads, is hard to synchronize for optimal decision-making. When edge servers are sparsely deployed, the options for task offloading are limited, and the QoS of mobile UEs in hot-spot areas is hard to guarantee.

Considering these two scenarios discussed above, two problems are addressed in this thesis: 1) with densely deployed edge servers, for each mobile UE, how to choose the appropriate edge servers independently without full system information is investigated, and 2) with sparsely deployed edge servers, how to serve dynamic hot-spot areas in an efficient and flexible way is emphasized.

1.2.1 Q-learning-based mobility management scheme

In fixed-to-mobile edge computing with densely deployed fixed edge servers, mobile UEs are surrounded by multiple available edge servers. To choose the appropriate edge servers for low-latency services, the handover decision-making in mobility management is redefined by jointly considering the communication and computation constraints. A reinforcement-learning-based mobility management scheme is proposed to deal with network uncertainties such as the server workload. Q-learning, a typical model-free reinforcement learning algorithm for identifying the optimal action-selection policy, is introduced to solve this problem. UEs learn the optimal association policy by interacting with edge servers. The state and action refer to the currently connected edge server and handing over to the target edge server, respectively. Each pair of state and action has a Q-value, i.e., the expected cumulative reward. Each UE updates Q-values based on the experienced task execution speed, which is defined as the reciprocal of the task latency. An ε-greedy strategy is introduced to balance exploration and exploitation: UEs either randomly select an action with probability ε, or select the action with the highest Q-value in the current state with probability 1 − ε. Simulation results show that the proposed learning-based scheme excels at dealing with system information uncertainties. When compared with the existing RSS-based mobility management scheme, it reduces the task latency while maintaining a relatively low energy consumption for mobile UEs.
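The scheme above can be sketched as a small tabular Q-learning loop. Everything concrete here — the three BSs, the parameter values, and the per-BS latencies — is an assumed toy setup for illustration, not the thesis's simulation.

```python
import random

random.seed(7)  # reproducible illustration

ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1
N_BS = 3
# Q[state][action]: state = currently connected BS, action = target BS.
Q = [[0.0] * N_BS for _ in range(N_BS)]

def choose_bs(state: int) -> int:
    """epsilon-greedy: explore a random BS with probability epsilon,
    otherwise exploit the BS with the highest Q-value."""
    if random.random() < EPSILON:
        return random.randrange(N_BS)
    return max(range(N_BS), key=lambda a: Q[state][a])

def update_q(state: int, action: int, task_latency: float) -> None:
    """Reward is the task execution speed (reciprocal of latency);
    the next state is the BS just handed over to."""
    reward = 1.0 / task_latency
    next_state = action
    Q[state][action] = (1 - ALPHA) * Q[state][action] + ALPHA * (
        reward + GAMMA * max(Q[next_state]))

state = 0
for _ in range(2000):                      # trial-and-feedback episodes
    action = choose_bs(state)
    latency = 0.1 if action == 1 else 0.5  # assumed: BS 1 is lightly loaded
    update_q(state, action, latency)
    state = action
```

After enough episodes, the Q-values steer the UE toward the lightly loaded BS even though the workload was never given to it explicitly — the essence of learning through trial and feedback.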


1.2.2 Online mobile edge server dispatching scheme

Considering the nonuniformly distributed tasks and dynamic demands, a mobile-to-mobile edge computing scenario is introduced in which the mobility of both edge servers and UEs is considered. As the sparsely deployed fixed edge servers are not flexible enough, unmanned aerial vehicle (UAV)-mounted edge servers are dispatched to serve hot-spot areas. The user association problem is then formulated as a variable-sized bin-packing problem with geographic constraints. Next, an online mobile edge server dispatching scheme is proposed to determine the hover locations of UAVs, in which tasks are geographically merged into several hot-spot areas. Theoretical analysis guarantees the worst-case performance bound. In addition, a hybrid dispatching scheme is further proposed in which mobile edge servers are dispatched to assist the fixed server deployment. The performance of fixed edge server deployment, mobile edge server dispatching, and the hybrid scheme is evaluated in terms of the number of served UEs, service latency fairness and resource utilization. Simulation results show that, while maintaining good latency fairness, the proposed mobile server dispatching scheme can serve more UEs while achieving a high resource utilization. Moreover, the hybrid scheme can satisfy even more UE demands while dispatching fewer UAVs and attaining a better server utilization.
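As a rough illustration of sequentially identifying hot-spot areas (not the actual dispatching algorithm analyzed in Chapter 3), a greedy sketch might repeatedly place a UAV where it covers the most unserved requests, subject to an assumed coverage radius and server capacity:

```python
import math

def dispatch_uavs(requests, radius=100.0, capacity=50, max_uavs=10):
    """requests: list of (x, y) task locations. Returns hover locations.
    Greedily picks the candidate hover point covering the most unserved
    requests, caps it at the server capacity, and repeats."""
    unserved = list(requests)
    hovers = []
    while unserved and len(hovers) < max_uavs:
        def coverage(c):
            return sum(1 for p in unserved if math.dist(c, p) <= radius)
        best = max(unserved, key=coverage)
        served = [p for p in unserved if math.dist(best, p) <= radius]
        served = served[:capacity]  # server capacity constraint
        if not served:
            break
        hovers.append(best)
        unserved = [p for p in unserved if p not in served]
    return hovers

# Two spatial clusters of requests -> two hover locations.
reqs = [(0, 0), (10, 5), (5, 8)] + [(500, 500), (510, 490)]
print(len(dispatch_uavs(reqs)))  # 2
```

Restricting candidate hover points to request locations keeps the sketch simple; a real scheme would also weigh flight constraints and the task finish time.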

1.3 Thesis Outline

Chapter 1 contains MEC background, research topics, and contributions in this thesis followed by the thesis structure.

Chapter 2 focuses on mobility management and user association when mobile users have multiple choices to offload their tasks to the fixed edge servers. A Q-learning-based mobility management scheme is proposed to handle system information uncertainties.

Chapter 3 addresses the inefficiency of the fixed server deployment. UAV-mounted edge servers are employed for flexible edge services. The mobile edge server dispatching problem is formulated as a variable-sized bin-packing problem with geographic constraints. An approximation algorithm is proposed to solve this problem with the worst-case performance guarantee.


Chapter 2

Q-learning-based Mobility Management under Uncertainties for Mobile Edge Computing

2.1 Introduction

To increase the system capacity and improve the user experience, BSs equipped with edge servers are densely deployed in hot-spot areas. Thus, mobile UEs have multiple choices to offload their tasks to edge servers. Driven by dense BS deployment, handover decisions have gradually evolved from traditional cell-centric to more flexible user-centric mobility management [25, 26]. In the traditional handover scheme, when more and more UEs move toward the region of a certain BS, the UEs keep connecting to that BS based on RSS. In MEC, this particularly deteriorates the user experience in hot-spot areas and, meanwhile, makes certain BSs overloaded, even though the UEs could offload their tasks to other neighboring BSs whose resources remain unused. Thus, not only the channel condition but also the computing capacity should be considered when associating UEs with the appropriate BSs. Moreover, user mobility aggravates the rapid alteration of the system: it is hard to obtain accurate full network information for an appropriate UE-BS association. Therefore, faced with performance deterioration and system uncertainties, mobility management is still of great importance and should be addressed in MEC scenarios.

Generally, mobility management focuses on whether and where to hand over. Demarchou et al. [27] proposed a user-centric handover scheme where a handover occurs according to the UE's future location predicted from its trajectory and velocity. Hasan et al. [28] adjusted handover parameters for overloaded BSs and their adjacent BSs, employing an adaptive threshold to determine which BSs are overloaded. Most existing work is based on whole-system information, such as network conditions, predicted UE mobility patterns, BS-side information, and future information; system uncertainties need to be addressed and handled. Previous efforts worked on reinforcement learning algorithms where each UE interacts with BSs and identifies the optimal action-selection policy for traditional cell selection [29–32], but only communication rewards such as gained system capacity or network throughput are considered. However, applicable handover criteria in MEC are not well studied in the literature. As MEC aims to finish tasks as soon as possible, both the communication costs and computation limits need to be considered. This motivated us to propose a generic solution for mobility management to deal with system information uncertainties in MEC.

In this chapter, a Q-learning-based mobility management scheme is proposed to handle the uncertainties of MEC-enabled networks. To shorten the service latency, each UE makes handover decisions based on the experienced task latency by selecting the action with the highest Q-value in the current state. To make a trade-off between exploration and exploitation, UEs keep exploring different BSs with probability ε as well as connecting to the optimal BS found so far with probability 1 − ε. After UEs connect to a new BS, their states change accordingly and the corresponding Q-values are updated to keep up with the dynamic network conditions.

The rest of the chapter is organized as follows. Related work is introduced in Section 2.2. The communication model and the computation model are introduced in Section 2.3. The mobility management scheme based on the redefined handover criteria is proposed in Section 2.4. To tackle the system uncertainties, a novel Q-learning-based mobility management scheme is proposed in Section 2.5. The performance of the proposed algorithm and the impact of key parameters are evaluated in Section 2.6. Finally, the conclusion is summarized in Section 2.7.


2.2 Related work

2.2.1 Mobility management

The majority of mobility management schemes are event-triggered and threshold-based. Various approaches have been proposed for handover decisions, i.e., signal-based approaches, cost function-based approaches, Markov decision process (MDP), fuzzy logic control (FLC), and game theory, etc. [33]. In the signal-based approaches, a handover is triggered if the measured downlink RSS from the target BS is stronger by a threshold than that from the current BS. The channel measurements are processed by mobile UEs and the measurement report is sent back to the BSs periodically. Fig. 2.1 shows the RSS of a mobile UE from adjacent BSs, which is generated based on the path loss model, shadow fading model, and multipath fast fading model. Details can be found in Section 2.3.2. When the mobile UE moves from the source BS to the target BS, the RSS from the source BS decreases and that from the target BS increases. Handovers will be triggered on the border between these two BSs. In the cost function-based approaches, the defined cost can be bandwidth, handover latency or their combination [34].

Figure 2.1: RSS from adjacent BSs.
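For intuition, an RSS curve like the one in Fig. 2.1 can be generated from a transmit power, a distance-dependent path loss and log-normal shadowing. The 3GPP-style path-loss constants, the transmit power and the 8 dB shadowing deviation below are illustrative assumptions, not the parameters of Section 2.3.2:

```python
import math
import random

def rss_dbm(distance_m: float, tx_power_dbm: float = 30.0,
            shadow_std_db: float = 8.0) -> float:
    """One noisy RSS sample: transmit power minus path loss,
    plus zero-mean log-normal (Gaussian in dB) shadowing."""
    path_loss_db = 128.1 + 37.6 * math.log10(distance_m / 1000.0)
    shadowing_db = random.gauss(0.0, shadow_std_db)
    return tx_power_dbm - path_loss_db + shadowing_db

# One noisy sample per distance; on average, RSS decreases with distance.
curve = [rss_dbm(d) for d in (50.0, 100.0, 200.0, 400.0)]
```

Averaging many such samples per distance reproduces the smooth crossing behavior at the cell border that triggers signal-based handovers.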

However, most existing work has focused on communication costs. In MEC, as tasks are offloaded to edge servers, computation costs, such as the task execution latency and energy consumption, also need to be considered. Thus, the most appropriate BSs may not be the ones with the best channel conditions, and the corresponding handover decision criteria should be redefined. Wang et al. [35] used an exhaustive search method to find the BSs with the minimum energy consumption. To maximize the system utility, Tan et al. [36] formulated the user association problem as a convex problem through variable relaxation. The limitation of this existing work is that it is hard to obtain full system information such as time-varying server workloads and UE trajectories.

2.2.2 Preliminaries of Q-learning

The main challenges in mobility management decision-making are the uncertainties of the network-wide information such as BS workloads, future channel conditions, and UE trajectories. Even if the full information is available, the rapid changes of the system are hard to synchronize for decision-making. With only local observations, learning algorithms, especially reinforcement learning, can help agents learn from previous experience to gain rewards and make decisions without complete information. Q-learning is a typical model-free reinforcement learning algorithm to identify the optimal action-selection policy. It has been widely used in wireless communications for different decision-making scenarios, such as RAT/cell selection, resource allocation, and interference control [37, 38].


Figure 2.2: Agent-environment interaction in Q-learning.

Q-learning belongs to incremental dynamic programming as the agent finds the optimal policy in a step-by-step manner [39]. To maximize the reward, each agent repeatedly interacts with the environment to learn how to take actions in a specific state, as illustrated in Fig. 2.2. With the consequences in terms of rewards (or penalties), the agent updates the action-values, i.e., Q-values, and learns the optimal action through the learning procedure.

At each decision epoch, e.g., time t, the agent takes action a_t based on the Q-values of all actions in its current state s_t, and then its state changes to s_{t+1}. With the state-action pair (s_t, a_t), the action-value function Q is updated after observing the reward:

Q(s_t, a_t) = (1 − α)Q(s_t, a_t) + α{R(s_t, a_t) + γ max_{a'} Q(s_{t+1}, a')}, (2.1)

where α ∈ (0, 1] is the learning rate which determines to what extent the recent observation overrides the experience, γ ∈ [0, 1] is the discount factor which denotes the effect of the future reward on the current state value, and a' denotes any action in the next state. R is the reward, which is defined as the task execution speed in this work. Details can be found in Section 2.5. Q(s_t, a_t) is updated by a weighted average of the current Q-value and the reward plus the discounted best Q-value in the next state. The approximation of the expected future cumulative reward V(s_t) and the optimal policy π^{OPT}(s_t) can thus be obtained as follows:

V(s_t) ← max_a Q(s_t, a), (2.2)

π^{OPT}(s_t) ← arg max_a Q(s_t, a). (2.3)

As mentioned above, the agent in Q-learning tends to stick with the action that maximizes the expected reward, which is called exploitation. However, pure exploitation can result in suboptimal stable equilibria without gaining more benefits [40]. Sometimes, the agent also needs to obtain new knowledge of the environment to improve its performance by selecting actions it has not tried before, which is called exploration. How to keep a balance between exploitation and exploration is of great importance in reinforcement learning. Taking the well-known ε-greedy policy as an example, at each decision epoch, the agent randomly selects a possible action with probability ε, or keeps the action with the highest Q-value with probability 1 − ε.
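As an illustration, the update in (2.1) combined with an ε-greedy action choice can be sketched as follows. The dictionary-based Q-table, the default Q-value of 0, and the α, γ settings are assumptions of this sketch, not the thesis implementation.

```python
import random

def epsilon_greedy(Q, state, actions, epsilon):
    """Pick a random action with probability epsilon, else the highest-Q action."""
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: Q.get((state, a), 0.0))

def q_update(Q, s, a, reward, s_next, actions, alpha=0.75, gamma=0.5):
    """One step of the Q-learning update in (2.1)."""
    best_next = max(Q.get((s_next, a2), 0.0) for a2 in actions)
    Q[(s, a)] = (1 - alpha) * Q.get((s, a), 0.0) + alpha * (reward + gamma * best_next)
```

With an empty table and reward 1.0, a single update gives (1 − 0.75)·0 + 0.75·(1.0 + 0.5·0) = 0.75, after which exploitation (ε = 0) selects that action.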

2.3 System model

In this section, the user mobility model and the task generation model are first introduced. Then, the communication model and the computation model are presented.

2.3.1 User mobility and task generation

As shown in Fig. 2.3, BSs n ∈ N (with |N| = N), each equipped with computing capabilities, are densely deployed in a hexagonal grid. UE i ∈ I (with |I| = I) can access the edge



Figure 2.3: The architecture of MEC with UEs.

server through a cellular BS and then a wired connection [41]. UEs move around following the random waypoint model [42]: each mobile UE repeatedly pauses for a certain time, selects a random direction, and then moves with a random speed for a random time. The pause interval, direction, speed, and walk interval are randomly and independently selected from predefined value ranges. Other, more complex mobility models are also applicable to this work.
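A minimal discrete-step sketch of the random waypoint behavior described above. The square area size and the default parameter ranges are illustrative (the values used in the simulations appear in Section 2.6); boundary handling by clamping is an assumption of this sketch.

```python
import math
import random

def random_waypoint(x, y, duration, area=300.0, v_range=(0.2, 2.2),
                    walk_range=(2.0, 6.0), pause_range=(0.0, 1.0)):
    """Yield successive (x, y) positions of one UE under the random waypoint model."""
    t = 0.0
    while t < duration:
        t += random.uniform(*pause_range)       # pause for a random time
        theta = random.uniform(0, 2 * math.pi)  # random direction
        v = random.uniform(*v_range)            # random speed
        walk = random.uniform(*walk_range)      # random walk interval
        # Move and clamp to the simulated area.
        x = min(max(x + v * walk * math.cos(theta), 0.0), area)
        y = min(max(y + v * walk * math.sin(theta), 0.0), area)
        t += walk
        yield x, y
```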


Figure 2.4: Transition diagram of task generation.

As shown in Fig. 2.4, UEs alternate between the execution state and the idle state. The task intensity p (per hour per UE) is the transition rate from the idle state to the execution state, and µ is the rate in the opposite direction. During the execution state, the task can be either executed locally or offloaded to the edge server. After executing the task, the UE returns to the idle state.
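The two-state process in Fig. 2.4 can be approximated in discrete time as follows; treating p and µ as per-slot transition probabilities (rather than continuous-time rates) is an assumption of this sketch.

```python
import random

def generate_tasks(slots, p_slot, mu_slot, seed=None):
    """Count tasks started over `slots` time slots of the idle/execution chain.

    p_slot: per-slot probability of idle -> execution (a new task is generated).
    mu_slot: per-slot probability of execution -> idle (the task is finished).
    """
    rng = random.Random(seed)
    state, tasks = "idle", 0
    for _ in range(slots):
        if state == "idle" and rng.random() < p_slot:
            state, tasks = "exec", tasks + 1  # new task generated
        elif state == "exec" and rng.random() < mu_slot:
            state = "idle"                    # task finished, back to idle
    return tasks
```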


2.3.2 Communication model

The path loss (in dB) between BS n and UE i at time t can be expressed as

PL_{t,n,i} = PL(d_0) + 10ς log_{10}(d_{t,n,i}/d_0), i ∈ I, n ∈ N, (2.4)

where PL(d_0) is the path loss at the reference distance d_0, d_{t,n,i} ≥ d_0 is the distance between BS n and UE i at time t, and ς is the path loss exponent.

Based on the path loss model, the RSS (in dBm) from BS n for UE i at time t is calculated as

P^U_{t,n,i} = P^U − PL_{t,n,i} − X_σ − 20 log_{10}|h|, i ∈ I, n ∈ N, (2.5)

where X_σ denotes the shadowing loss (in dB) and obeys a Gaussian distribution with zero mean and standard deviation σ, h denotes the multipath fast fading channel gain [43], which can be modeled as a Rician distribution with K-factor, where K is the ratio of the direct-path power to the diffuse power, and P^U (in dBm) is the transmitted power of UE i. For simplicity, we assume that all UEs have the same transmission power. The maximum uplink transmission rate for UE i at time t can be calculated as

B_{t,n,i} = B_0 log_2(1 + 10^{P^U_{t,n,i}/10} / (I_{t,n,i} + N_0)), (2.6)

where B_0 is the channel bandwidth, I_{t,n,i} is the mutual interference caused by other UEs [44], and N_0 is the noise power.
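The chain from (2.4) to (2.6) can be sketched as follows. All numeric defaults (reference path loss, path loss exponent, transmit power, bandwidth) are illustrative assumptions, not the thesis parameters; shadowing and fading are fixed at neutral values for clarity.

```python
import math

def uplink_rate(d, p_tx_dbm=23.0, pl_d0=40.0, d0=1.0, exponent=3.5,
                shadow_db=0.0, fading_gain=1.0, interference_mw=0.0,
                noise_dbm=-174.0, bandwidth_hz=20e6):
    """Return (RSS in dBm, uplink rate in bit/s) at distance d meters."""
    # Path loss (2.4): log-distance model above the reference distance d0.
    pl = pl_d0 + 10 * exponent * math.log10(max(d, d0) / d0)
    # RSS (2.5): transmit power minus path loss, shadowing, and fast fading.
    rss_dbm = p_tx_dbm - pl - shadow_db - 20 * math.log10(fading_gain)
    # Rate (2.6): Shannon capacity with interference plus noise (in mW).
    signal_mw = 10 ** (rss_dbm / 10)
    noise_mw = 10 ** (noise_dbm / 10)
    rate = bandwidth_hz * math.log2(1 + signal_mw / (interference_mw + noise_mw))
    return rss_dbm, rate
```

As expected from (2.4)–(2.6), the achievable rate decreases monotonically with distance for fixed fading and interference.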

2.3.3 Computation model

Local task execution

With the dynamic voltage and frequency scaling (DVFS) technology, UEs can adjust their computing capacity for different tasks [44]. Therefore, the local task execution time is denoted as

D^L_{t,i,m} = P_m A_m / f_{t,i,m}, (2.7)

where P_m is the computation intensity, A_m is the data size of task m executed on UE i, and f_{t,i,m} is the allocated CPU frequency for task m.

The energy consumption of local task execution is often modeled as a polynomial energy cost function, which can be calculated as [45]

E^L_{t,i,m} = (η_i (f_{t,i,m})^{X_i} + β_i) D^L_{t,i,m}, (2.8)

where X_i and η_i are the parameters of the energy curve, which differ from device to device. Typically, X_i ≤ 3 [46]. β_i represents the energy consumption of leakage currents.
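Equations (2.7) and (2.8) translate directly into a small helper; the default energy-curve parameters (η, X, β) are illustrative assumptions only.

```python
def local_cost(P_m, A_m, f, eta=1e-27, X=3, beta=0.1):
    """Return (latency in s, energy in J) for local execution of one task.

    P_m: computation intensity (cycles/bit), A_m: task size (bits),
    f: allocated CPU frequency (cycles/s). eta, X, beta are assumed
    device-specific energy-curve parameters.
    """
    d_local = P_m * A_m / f                    # execution latency (2.7)
    e_local = (eta * f ** X + beta) * d_local  # polynomial energy cost (2.8)
    return d_local, e_local
```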

Task execution on the edge server

The computation latency of processing the task is computed as

D^C_{t,i,m} = P_m A_m / Q^E, (2.9)

where P_m is the computation intensity of task m, and A_m is the data size of task m offloaded to BS n. A first-come-first-serve (FCFS) manner is considered as the task processing strategy [47]. Other sophisticated models can also be applied to our work. For simplicity, we assume all BSs have the same computation capacity Q^E.

For UE i, the transmission latency of task m at time t is given by

D^{Comm.}_{t,i,m} = A_m / B_{t,n,i}. (2.10)

As multiple UEs are competing for the limited computation resources at the edge server, the task queuing latency of BS n at time t is considered as follows:

D^Q_{t,i,m} = ω / Q^E, (2.11)

where ω is the current workload of BS n at time t, which changes over time. As shown in Fig. 2.5, the overall latency of the edge service D^E_{t,i,m} consists of the transmission, queuing, and computation latency:

D^E_{t,i,m} = D^{Comm.}_{t,i,m} + D^Q_{t,i,m} + D^C_{t,i,m}. (2.12)

Similar to [44, 48], the data transmission also consumes the tail energy in practice.


Figure 2.5: Latency model of task offloading.

Thus, the energy consumption for the transmission of task m is

E^T_{t,i,m} = P^T_i · D^{Comm.}_{t,i,m} + P^{tail}_i · D^{tail}_i, (2.13)

where P^T_i is the transmission power of UE i, P^{tail}_i is the cellular tail power consumption of UE i, and D^{tail}_i is the tail time.

The energy consumption during the queuing and execution time is

E^Q_{t,i,m} = P^{idle}_i (D^C_{t,i,m} + D^Q_{t,i,m}), (2.14)

where P^{idle}_i is the idle power of UE i.

Thus, the total energy consumption of task offloading consists of the transmission energy consumption and the idle energy consumption, which is calculated as

E^E_{t,i,m} = E^T_{t,i,m} + E^Q_{t,i,m}. (2.15)

Considering that the result of a task is relatively small when compared with the transmitted data, similar to previous studies [24, 44], the downlink transmission latency is not considered in this work.
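Putting (2.9)–(2.15) together, the per-task offloading cost can be sketched as below; the power and tail-time defaults are illustrative assumptions, not measured device values.

```python
def edge_cost(P_m, A_m, rate, workload, Q_E,
              p_tx=0.5, p_tail=0.3, d_tail=1.0, p_idle=0.05):
    """Return (latency in s, energy in J) for offloading one task.

    P_m: computation intensity (cycles/bit), A_m: task size (bits),
    rate: uplink rate (bit/s), workload: current BS workload (cycles),
    Q_E: edge server capacity (cycles/s). Powers (W) and tail time (s)
    are assumed example values.
    """
    d_comm = A_m / rate                  # transmission latency (2.10)
    d_queue = workload / Q_E             # queuing latency (2.11)
    d_comp = P_m * A_m / Q_E             # computation latency (2.9)
    d_edge = d_comm + d_queue + d_comp   # overall edge latency (2.12)
    e_tx = p_tx * d_comm + p_tail * d_tail        # transmission + tail energy (2.13)
    e_idle = p_idle * (d_comp + d_queue)          # idle energy while waiting (2.14)
    return d_edge, e_tx + e_idle                  # total offloading energy (2.15)
```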

2.4 Mobility management with full information

Unmanaged mobility in the wireless environment causes communication disruption when UEs move around [49]. In existing networks, a handover procedure is triggered when the RSS of the serving BS drops below that of the best available BS by a certain threshold. However, in MEC-enabled networks, as BSs need to provide both communication and computation services, using the same set of handover criteria may degrade QoS. In this section, we adapt the handover criteria to the MEC scenario.


Algorithm 1 Delay-based Mobility Management Scheme (baseline)
1: Initialize the UE connection state by associating the UE to the BS with the strongest RSS;
2: for t = 1, . . . , T do
3:   if ∃ Task m then
4:     Calculate the expected latency for each available BS as in (2.12);
5:     Select the BS with the shortest expected latency;
6:   else
7:     Select the BS with the strongest RSS in excess of a certain threshold when compared with the serving BS;
8:   end if
9: end for

Task latency can be reduced if UEs connect to nearby underloaded BSs whose edge servers can complete the task rapidly. Thus, as a benchmark, we first describe a simple but efficient mobility management scheme based on accurate full network information, which is assumed to be synchronized in real time.

The delay-based mobility management scheme with full information (DFI) is summarized in Algorithm 1. It can be considered as a greedy optimization approach. Each UE is initially connected to the BS with the strongest RSS. During a time slot with generated tasks to offload, each UE calculates the expected task latency based on the task information, channel conditions, and BS workloads, and then greedily chooses the BS with the shortest expected task latency. If there is no task to offload, UEs connect to the BS with the strongest RSS.
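The per-slot decision of Algorithm 1 reduces to a one-line greedy choice. The sketch below assumes callables `edge_latency(n)` and `rss(n)` are available from the (assumed synchronized) full system information; their names are placeholders.

```python
def select_bs_dfi(bss, has_task, edge_latency, rss):
    """Greedy BS choice in DFI: shortest expected latency with a task,
    strongest RSS otherwise."""
    if has_task:
        # Lines 4-5 of Algorithm 1: expected latency per BS as in (2.12).
        return min(bss, key=edge_latency)
    # Line 7: fall back to the RSS-based criterion.
    return max(bss, key=rss)
```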

2.5 Mobility management with partial information

The DFI algorithm mentioned above requires full system information such as server workloads and channel conditions. However, it is hard to maintain accurate real-time system information due to the time-varying network and UE mobility. To tackle the limitation of network uncertainties, the Q-learning algorithm is introduced to solve the mobility management problem with partial information. Our objective is to create an intelligent mobility management scheme in which mobile UEs automatically learn the handover strategy through trial and feedback. The elements of the Q-learning algorithm in the mobility management problem at time t are defined as follows:


Algorithm 2 Q-learning-based Mobility Management Scheme
1: Initialize action-value function Q(s, a) ← 0;
2: Initialize state s by associating the UE to the BS with the strongest RSS;
3: for t = 1, . . . , T do
4:   if ∃ Task m then
5:     With probability ε select a random action a_t, otherwise select action a_t = arg max_a Q(s_t, a);
6:     Execute action a_t, observe reward R(s_t, a_t) and obtain the next state s_{t+1};
7:     Update action-value Q according to (2.1);
8:   else
9:     Select the BS with the strongest RSS in excess of a certain threshold when compared with the serving BS;
10:  end if
11: end for

The objective is to achieve the shortest task latency in MEC.

• State: The state is defined as S_t = {s_t = n × l | n ∈ N, l ∈ L}, jointly considering the serving BS n and the current channel level l. Based on the RSS, l is divided into three levels: > −130 dBm, −130 ∼ −140 dBm, and < −140 dBm.

• Action: The action, i.e., choosing the target BS, is the decision made by the agent. The set of actions per state is defined as A_t = {a_t = n' | n' ∈ N}. By taking action a ∈ A_t, the agent transitions from the current state to the next state.

• Reward: The reward for executing an action a_t in state s_t is the task execution speed, which is defined as R(s_t, a_t) = A_m / D^E_{t,i,m}.
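A hypothetical encoding of these elements: the channel-level thresholds follow the three RSS ranges above, while the flat state index and the helper names are illustrative assumptions, since the thesis only specifies the state as the BS/channel-level product.

```python
def channel_level(rss_dbm):
    """Map RSS (dBm) to one of three channel levels L = {0, 1, 2}."""
    if rss_dbm > -130:
        return 0          # strong: above -130 dBm
    if rss_dbm >= -140:
        return 1          # medium: -140 to -130 dBm
    return 2              # weak: below -140 dBm

def encode_state(bs, rss_dbm, num_levels=3):
    """Flat joint state index over (serving BS, channel level)."""
    return bs * num_levels + channel_level(rss_dbm)

def reward(task_bits, edge_latency_s):
    """Task execution speed R(s_t, a_t) = A_m / D^E (bits per second)."""
    return task_bits / edge_latency_s
```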

The proposed Q-learning-based mobility management scheme with partial information (QPI) and the corresponding process are summarized in Algorithm 2 and Fig. 2.6. The action-value function and each UE state are first initialized. At the beginning, each UE is connected to the BS with the strongest RSS. In each time slot, the ε-greedy policy is used to solve the trade-off between exploitation and exploration. UEs keep exploring different BSs as well as trying to connect to the optimal BS based on the experienced task execution speed. In exploitation, UE i connects to the BS with the highest Q-value in the current state. In exploration, UE i randomly



Figure 2.6: Flow chart of the Q-learning-based mobility management scheme.

selects a BS for task offloading. UE i thus observes the experienced task execution speed as the reward. Then it updates the corresponding Q-value accordingly. Similar to DFI, if there is no task to offload, UEs will choose the target BS with the strongest RSS.

In theory, the Q-learning algorithm has been proven to converge towards the optimum when every state-action pair is visited infinitely often and the learning rate is decreased to zero [39]. However, when UEs move towards another BS, the environment varies and the Q-values need to be updated accordingly. In this case, a UE's adaptation to the system also alters the system itself, which triggers other UEs to re-adjust their strategies. Thus, the estimation is never completely convergent but continues to change in response to the dynamic system, which is desirable in a nonstationary system [50]. On the other hand, the exploration step encourages the UE to select an action independently of the state-action estimates. Although this may cause unnecessary handovers, handover decisions are more flexible if alternative strategies appear.

2.6 Performance evaluation

In this section, extensive simulations are conducted to evaluate the performance of the proposed scheme.


Table 2.1: System parameters in simulation.

Parameter | Value  | Parameter | Value
P^U       | 46 dBm | I         | 350
Q^E       | 5 GHz  | P         | 31680 cycles/bit
A         | 60 kB  | K         | 3
B_0       | 20 MHz | N_0       | −174 dBm

2.6.1 Simulation setup

Considering an exhibition event scenario in Fig. 2.7, 4 BSs are deployed in a hexagonal grid with radius 80 m and 350 UEs move around among the BSs. The random waypoint model is used as the UE mobility model with speed v ∈ [0.2, 2.2] m/s, walk interval w ∈ [2, 6] s, and pause interval p ∈ [0, 1] s. A typical computation-intensive application, face recognition, is considered in this scenario, which requires comparison with a large database stored on the edge server [6, 51]. As shown in Table 2.1, other simulation parameters are selected based on [24, 44]. By default, α, γ, ε, and p are 0.75, 0.5, 0.5, and 55, respectively.


Figure 2.7: Simulation topology.

For comparisons, in addition to DFI and QPI, two other benchmarks are introduced to evaluate the performance of the two proposed algorithms:

• RSS-based mobility management scheme with full information (RFI): UEs always connect to the best available BS based on the channel condition.

• Local task execution (LOC): UEs always execute tasks locally and select BSs based on RSS.


2.6.2 BS-side performance evaluation


Figure 2.8: Variance of the BS workload.

Under the temporal distribution of the random waypoint model, people may gather at the center of the exhibition. Thus, if UEs are assigned based on RSS, BS 1 and BS 4 can be overloaded when compared with BS 2 and BS 3. From the BS side, the decision-making in mobility management affects the workload of each BS. The workloads change over time as tasks are generated and finished. As shown in Fig. 2.8, the variance of the BS workload is evaluated; a higher variance indicates a more unbalanced workload among BSs. When more and more UEs move towards a certain BS, that BS may be overloaded with RFI. However, DFI shows the ability to balance the BS workload as a UE with full system information always chooses the BS that can complete the task as soon as possible. When compared with DFI, QPI manifests the capability of handling the system uncertainties, although with some learning loss. As executing tasks locally does not affect the BS workload, LOC is not evaluated on this performance metric.

2.6.3 UE-side performance evaluation

The impact of the task intensity on the average per-task latency is shown in Fig. 2.9. The larger the task intensity is, the more tasks are generated by UEs, which indicates that the edge server may receive more offloaded tasks. The average per-task latency in LOC is limited by the UE's local computation capacity and remains constant in this case, since UEs executing tasks locally are not affected by the edge server. In DFI, QPI, and RFI, offloading tasks to the edge server helps reduce the average per-task latency when compared with LOC. Meanwhile, both QPI and DFI perform better than



Figure 2.9: Average per-task latency.

RFI by jointly considering the channel condition and computing capacity. Moreover, RFI is more sensitive to the task intensity, while QPI and DFI can balance the BS workload effectively. That is, if all users select the BS that provides the best wireless channel condition, that BS can be overloaded and the tasks need to wait for a relatively long time to be processed at the edge server. In addition, the channel conditions become worse with stronger interference when more UEs are transmitting tasks, thus increasing the data transmission latency in task offloading. What distinguishes the QPI algorithm is that it does not need to know the full system information: each UE makes decisions based on its experienced task execution speed while updating the Q-values.


Figure 2.10: Average per-UE energy consumption.

The impact of the task intensity on the per-UE energy consumption is shown in Fig. 2.10. With the increase of the task intensity, the average per-UE energy consumption increases. When a task is offloaded to the edge server, the UE only needs to transmit the task to the BS and wait until the task is accomplished. The longer the task offloading takes, the more energy it consumes. However, considering the sharp difference when compared with LOC, the other three schemes all maintain a relatively low energy consumption level, where the proposed QPI scheme does not require any prior knowledge of the system.

The impact of the task intensity on the handover frequency is shown in Fig. 2.11. DFI and QPI are sensitive to the disturbance of the system as the workload of the edge server changes over time. LOC and RFI are quite stable as the decision-making is triggered only based on RSS, which is mainly affected by the relative location between the UE and the BS. With the increase of the task intensity, handover frequency increases dramatically in DFI because the UE always tries to connect with the BS that completes the tasks earlier. QPI does not always perform the best in every performance metric. However, QPI acts like a trade-off between the task latency and the handover frequency. Moreover, the gap between QPI and the best baseline is caused by the trial and feedback at the beginning. More experience results in better approximation.


Figure 2.11: Per-UE handover frequency.


Figure 2.12: Impact of ε.

The impact of three parameters, i.e., ε, γ, and α, on Algorithm 2 is shown in Fig. 2.12, Fig. 2.13, and Fig. 2.14, respectively. The change of the energy consumption is consistent with that of the task latency. As shown in Fig. 2.12, the larger ε is, the higher the probability that a UE randomly chooses a BS to fully explore the state space. When ε is 0 in QPI, UEs make handover decisions only based on the experienced task execution speed and take the action with the largest Q-value.


Figure 2.13: Impact of γ.

Figure 2.14: Impact of α.

However, an excessive ε introduces stronger randomness. This indicates a trade-off between exploitation and exploration. Choosing an appropriate ε encourages UEs to both explore the state space and try to connect to the highest-ranked BS according to their experience. As shown in Fig. 2.13, a small discount factor γ makes UEs near-sighted by only considering the current rewards, while with the increase of γ, UEs value the future rewards more than the current rewards and strive for a long-term reward. Fig. 2.14 evaluates the impact of the learning iterations and the learning rate on the sum of Q-values per UE. For each UE, the sum of all Q-values keeps increasing and then becomes stable as UEs interact with the environment with more offloaded tasks. Moreover, a higher learning rate α accelerates the learning procedure as the UE weighs the recent observation more heavily against the past experience.

2.7 Conclusions

In this chapter, a Q-learning-based mobility management scheme is proposed to tackle the challenges of mobility management in MEC. The network changes rapidly because of user mobility. To handle the uncertainties of the system information, UEs make decisions by interacting with the environment and keep updating their knowledge based on the experienced task execution speed. Simulations show that the proposed scheme is superior to the traditional RSS-based mobility management scheme in reducing the task latency while maintaining a relatively low energy consumption.


Chapter 3

Online UAV-mounted Edge Server Dispatching for Mobile-to-Mobile Edge Computing

3.1 Introduction

Various research topics, e.g., task offloading, caching, and resource allocation, are all based on the assumption that edge servers have been placed already [52–54]. How and where to deploy edge servers needs to be addressed. Several solutions have been proposed to locate edge servers, among which deploying more edge servers in hot-spot areas outperforms the uniform deployment [55]. Due to user mobility and dynamic demands, hot-spot areas at the current time may cool down soon afterward. To serve time-varying crowds, Yin et al. [56] mapped user clusters to fixed edge servers periodically to reduce the infrastructure cost. However, due to unevenly distributed tasks, some fixed edge servers are unavoidably overloaded while other servers are idle. Therefore, techniques such as task migration need to be introduced to balance the workload among edge servers. This, in turn, results in extra communication and signaling overhead, and increases task latency as tasks need to be transferred between servers [57]. On-demand network deployment [58] has been seen as a promising proposal to serve dynamic hotspots for big events or disaster recovery. Meanwhile, it can improve computing resource utilization with on-demand provisioning when compared with BS sleeping technologies.


UAVs have been considered as a promising way to assist wireless communication networks [59]. With the development of UAVs, deploying edge servers on them draws significant attention due to their flexible mobility [60]. The limitation of flight time and battery power can be tackled by energy harvesting technologies, e.g., solar power for 28+ continuous flight hours [61, 62]. Moreover, general commercial UAVs such as the DJI MATRICE 600, DJI S900, and Tarot T-18 can take off with 6∼8 kg payloads, while heavy-lift drones such as the HX8 Power XXL can fly with up to 45 kg of goods. This makes it feasible for UAVs to carry a server and hover at specific places to collect and process the offloaded tasks. Recently, a prototype named SkyCore was built to support on-demand connectivity [58]. Network functions are softwarized and located in a single-board lightweight server, which can be directly deployed on DJI Matrice 600 Pro drones. The synchronization overhead of inter-UAV communication is reduced by segment-based routing with the label of the next tunnel segment tagged on the packets. Real-world experiments show that mobile edge servers can not only provide timely services for certain hot-spot areas but also take advantage of their location flexibility to deal with the dynamic environment with negligible synchronization overhead.

Most existing work has focused on UAV trajectory planning in which the location of UEs remains unchanged and UAVs maintain a continuous flight among several fixed UEs [63–65]. The limitations of the existing work lie in 1) UE mobility which causes dynamic nonuniform tasks, and 2) network scalability, i.e., when a large number of UEs offload tasks, UAV trajectory will be affected by each individual UE, which is time-consuming and expensive to adjust. Observing the inefficiency of fixed edge server deployment and the limitation of the current UAV trajectory planning, we are motivated to investigate how to dispatch UAVs to appropriate hover locations among time-varying hot-spot areas and associate mobile UEs with mobile edge servers.

In this chapter, mobile-to-mobile edge computing is considered, in which both UEs and edge servers can move around. UAV-mounted edge servers are employed for flexible edge services. Constrained by the limited computation capacity and communication range, the edge server dispatching problem is formulated as a variable-sized bin-packing problem with geographic constraints, which is NP-hard [66]. An online mobile edge server dispatching scheme is proposed to determine the hover locations of the mobile edge servers sequentially. With a gradually increased communication radius, hot-spot areas are identified based on the task intensity. The performance of the proposed scheme is theoretically analyzed with a worst-case performance guarantee. A hybrid scheme is also evaluated in which UAVs are dispatched to assist the


fixed BSs with task offloading. Simulation results show that the mobile dispatching scheme excels at handling dynamic nonuniformly distributed tasks and maintaining good task latency fairness. When compared with the fixed server deployment, the number of served UEs increases by 59% on average, and the server utilization achieves 98% during the daytime. In addition, the hybrid scheme can satisfy even more demands while dispatching fewer UAVs with better server utilization.

The rest of the chapter is organized as follows. The related work is introduced in Section 3.2. The motivation for deploying mobile edge servers is illustrated in Section 3.3. The communication model and the computation model are introduced in Section 3.4. In Section 3.5, the mobile dispatching problem is formulated and an online mobile edge server dispatching scheme is proposed with a performance guarantee. The performance of the proposed algorithm and the impact of key parameters are evaluated in Section 3.6. Section 3.7 presents the conclusion.

3.2 Related work

3.2.1 Fixed edge server placement

Most existing work in MEC assumes that edge servers are deployed following a certain distribution such as the uniform distribution [52–54]. How to locate edge servers has been heavily studied, as it plays an important role in improving the QoS. Li et al. [55] compared the performance of two different edge server deployment schemes, i.e., the uniform distribution and a nonuniform distribution based on UE density. Evaluation results showed that UE distribution-aware server deployment can achieve a better performance than the uniform distribution. Facing dynamic crowds, Yin et al. [56] first used farthest point clustering (FPC) to group UEs and calculated the ideal locations of edge servers in each cluster by minimizing the total communication distance. Wang et al. [67] located the edge servers by solving a mixed-integer programming problem. Lai et al. [68] deployed edge servers and maximized the number of served UEs through lexicographic goal programming. However, the existing optimization problems are formulated based on selected BS locations among which edge servers are deployed.

Moreover, UE mobility can not only affect mobility management in mobile networks but also influence the network workload dynamically. Ceselli et al. [69] located cloudlet facilities among the candidate locations and introduced VM migration to rebalance the system. Locations of APs and cloudlets are determined based on the fixed locations of aggregation nodes.

Due to dynamic nonuniformly distributed tasks, the limitations of the fixed deployment are: 1) multi-hop communications are needed if the available edge servers are not close enough to mobile UEs, and 2) computation resources cannot be fully utilized at off-peak hours.

3.2.2 Mobile edge server trajectory planning

To tackle the limitations of the fixed server deployment, extensive efforts have been dedicated to deploying edge servers on UAVs. Considering the flexible movement of UAVs, Cheng et al. [60] designed the architecture of a UAV-BS integrated mobile edge network for road safety scenarios, where UAVs are dispatched to areas of interest and help process computation-intensive tasks. Zhou et al. [64] and Cheng et al. [65] determined the UAV trajectory by solving a mixed-integer non-convex problem with the objective of computation rate maximization and communication rate maximization, respectively. Similarly, Jeong et al. [63] jointly optimized the UAV trajectory as well as the bit allocation for both communication and computation purposes; the formulated energy consumption minimization problem is then solved by successive convex approximation. Hence, mobile edge servers have the advantage of handling dynamic nonuniform tasks and avoiding the waste of computation resources. However, most existing work does not consider the time- and space-varying features of user tasks. Thus, this chapter investigates how to dispatch UAV-mounted edge servers to dynamic hot-spot areas.

3.3 Motivation

3.3.1 Tencent trace description

The real-time mobile request distribution is of great importance in MEC. It can not only provide the geographic information of a single mobile request but also a macroscopic view of the dynamic changes. Benefiting from the development of GPS-enabled devices, mobile UEs are offered geo-spatial and point of interest (POI)-related services.


The Tencent RTUD traces are derived from the collected geographic information when UEs are using its services1. According to the Tencent Big Data report, more than 1.3 billion monthly active devices were using Tencent location-based applications, e.g., WeChat and QQ, in 2018 [70]. In the RTUD traces, the device location (latitude, longitude, and region ID) and the query time slot index are provided for each request. Then, the intensity of the geo-spatial requests can be derived from the traces. The time interval of the query time slot is 5 min.

3.3.2 Dynamic task requests

Figure 3.1: Tencent requests distribution on Oct. 1st, 2018 (a national holiday in China) at Happy Valley, Beijing: (a) 9:30 AM; (b) 3:30 PM; (c) 9:30 PM.

Happy Valley, a theme park in Beijing, is selected as the focus scenario in this chapter due to the following features: 1) the size of the park is reasonable (1000 m × 500 m), i.e., neither so small that all requests can be covered by one BS and multiple UAVs are unnecessary, nor so large that UAVs cannot reach specific locations on time; and 2) the requests are dynamic, i.e., UEs exhibit an obvious group effect and form different dense crowds with bursty requests over time. As shown in Fig. 3.1, dynamic requests are nonuniformly distributed in the park on Oct. 1st, 2018. UEs keep forming hot-spot areas at different places. In the morning, people gather at the entrance and then enter the park to take amusement rides. At night, people gradually exit the park and it becomes nearly empty.

As shown in Fig. 3.2, the locations of the fixed LTE BSs are mostly around the theme park². In MEC, if an edge server is deployed at each BS, it will result in an unbalanced workload among BSs and fewer served UEs. When UEs offload tasks to edge servers, some BSs are fully utilized, such as the ones located near the entrance (BS 1, BS 2 and BS 3). However, other BSs located along the road outside the park are underloaded and thus their computation resources are wasted. Even with a better server placement scheme, the fixed deployment scheme still faces the problem of computation resource inefficiency in the long run. In contrast, mobile edge servers can be dispatched to handle bursty requests in hot-spot areas. With fewer requests, some UAVs can stay in service while others can fly back to the warehouse for maintenance.

¹The geographic information of the mobile requests can be found at https://heat.qq.com/heatmap.php.

Figure 3.2: System model of MEC with UAV-mounted edge servers.

What interests us most is how to dispatch mobile edge servers in an effective and efficient way. Mobile edge servers can be dispatched closer to the crowds to serve as many UEs as possible and, meanwhile, take advantage of their flexibility to increase server utilization.

3.4 System model

In this section, the communication and computation models of MEC are introduced. The major notations used in this chapter are summarized in Table 3.1. As shown in Fig. 3.2, BSs and UAVs are all equipped with edge servers. Generally, UE m ∈ M (with |M| = M) can offload its computing task to edge server n ∈ N (with |N| = N) through either the fixed BSs or the access points on UAVs. Tasks at the same location are processed one by one. Where no confusion arises, m is used to denote tasks instead of UEs in the following.


Table 3.1: Definition and notation in Chapter 3.

Symbol        Definition and notation
N             Set of mobile edge servers, and N = |N|
M             Set of geo-spatial tasks, and M = |M|
(um, vm, 0)   Coordinates of task m ∈ M
(un, vn, ι)   Coordinates of mobile edge server n ∈ N, at height ι
dmn           Distance between server n and task m
γ0            Reference SNR at a distance of 1 m
N0            Noise power
A             Data size of each task
B(dmn)        Data transmission rate, which depends on the distance between mobile UEs and edge servers
xmn           Binary variable: task m is served by mobile edge server n (xmn = 1) or not (xmn = 0), m ∈ M, n ∈ N
Sn            Set of tasks assigned to mobile edge server n ∈ N
rn            Radius of the coverage of mobile edge server n ∈ N, rmin ≤ rn ≤ rmax
∆r            Increment of the radius r
P             Computation intensity of each task
QE            Computation capacity of a mobile edge server in CPU cycles/s
ϕ             Latency deadline of each task
φ             Total budget for mobile edge servers

3.4.1 Communication model

Let (un, vn, ι) and (um, vm, 0) denote the coordinates of mobile edge server n and task m, respectively. Mobile servers are assumed to hover at the same height ι and can adjust the communication coverage through their antenna angle [71]. The distance between server n and task m can be calculated as

    d_{mn} = \left\| (u_n, v_n, \iota) - (u_m, v_m, 0) \right\|
           = \sqrt{(u_n - u_m)^2 + (v_n - v_m)^2 + \iota^2},    (3.1)

where \|\cdot\| is the Euclidean norm.
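Eq. (3.1) can be checked numerically. The following sketch is illustrative only; the function name, coordinates, and hover height are assumptions, not values from the traces.

```python
import math

# Hypothetical helper implementing Eq. (3.1): server n hovers at
# (u_n, v_n, iota), while task m sits on the ground at (u_m, v_m, 0).
def distance(server, task, iota):
    (u_n, v_n), (u_m, v_m) = server, task
    return math.sqrt((u_n - u_m) ** 2 + (v_n - v_m) ** 2 + iota ** 2)

# A server hovering at height 100 m, a task 300 m east and 400 m north:
d = distance((300.0, 400.0), (0.0, 0.0), 100.0)  # sqrt(300^2 + 400^2 + 100^2)
```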

The pathloss of the UAV channel can be modeled as

    PL_{mn} = PL(d_0) + 10\varsigma \log\left(\frac{d_{mn}}{d_0}\right), \quad m \in \mathcal{M},\ n \in \mathcal{N},    (3.2)

where \varsigma is the pathloss exponent of the line-of-sight (LoS) or non-LoS (NLoS) channel, and PL(d_0) is the pathloss at the reference distance d_0.

Details of UAV channel modeling can be found in [71]. The data transmission rate can be obtained by

    B(d_{mn}) = B_0 \log\left(1 + \frac{P_{TX} |H_{mn}|^2}{N_0}\right),    (3.3)

where B_0 is the channel bandwidth, P_{TX} is the transmission power of UEs, N_0 is the noise power, and H_{mn} is the total channel gain, which consists of the pathloss, the effect of shadowing, and the effect of multipath fast fading. Details can be found in Chapter 2.
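The chain from distance to achievable rate in Eqs. (3.2)-(3.3) can be sketched as follows. This is a simplified illustration that keeps only the pathloss component of the channel gain H_{mn} (no shadowing or fast fading) and uses assumed parameter values for the bandwidth, powers, and pathloss exponent; it is not the exact channel model of [71].

```python
import math

def pathloss_db(d_mn, pl_d0=40.0, d0=1.0, varsigma=2.5):
    """Eq. (3.2): PL_mn = PL(d0) + 10 * varsigma * log10(d_mn / d0), in dB."""
    return pl_d0 + 10.0 * varsigma * math.log10(d_mn / d0)

def rate_bps(d_mn, b0_hz=1e6, p_tx_dbm=23.0, n0_dbm=-100.0):
    """Eq. (3.3): B(d_mn) = B0 * log2(1 + P_TX * |H_mn|^2 / N0),
    with |H_mn|^2 approximated by the pathloss alone."""
    snr_db = p_tx_dbm - pathloss_db(d_mn) - n0_dbm
    return b0_hz * math.log2(1.0 + 10.0 ** (snr_db / 10.0))
```

As expected, the achievable rate decreases monotonically with distance, e.g., rate_bps(10) > rate_bps(100) > rate_bps(1000).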

3.4.2 Computation model

Let x_{mn} denote whether task m is offloaded to edge server n (x_{mn} = 1) or not (x_{mn} = 0). As the computation capacity of each edge server is limited, if task m is offloaded to edge server n, the task latency D_m^{Offload} consists of the communication time D_m^{Comm.}, i.e., the time for sending the task and receiving the results, and the computation latency D_m^{Comp.}, which can be calculated as

    D_m^{Offload} = D_m^{Comm.} + D_m^{Comp.}
                  = \frac{A}{B(d_{mn})} + \frac{\omega}{Q^E},    (3.4)

where A/B(d_{mn}) is the communication latency, and B(d_{mn}) is the data transmission rate at distance d_{mn}. Each task is assumed to have the same data size A; this work can be extended to variable-sized tasks, as tasks can be divided into equal-sized small tasks. \omega/Q^E is the computation latency of task offloading, including queuing, where \omega = P + P_0 is the computation workload P (in CPU cycles) of task m plus the current workload P_0 of the assigned edge server n, and Q^E is the CPU processing capacity (in CPU cycles/s) of the edge server. Several virtual machines (VMs) are deployed at each edge server. If the number of offloaded tasks exceeds the number of VMs, tasks are queued up and executed in a first-come-first-served (FCFS) manner.
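A minimal sketch of Eq. (3.4) is given below; all numeric values are illustrative assumptions, and the server is taken to be idle (P0 = 0).

```python
def offload_latency(a_bits, rate_bps, p_cycles, p0_cycles, qe_cycles_per_s):
    """Eq. (3.4): D_m^Offload = A / B(d_mn) + (P + P0) / Q^E."""
    d_comm = a_bits / rate_bps                          # send task, receive result
    d_comp = (p_cycles + p0_cycles) / qe_cycles_per_s   # queued + own workload
    return d_comm + d_comp

# A 1-Mbit task over a 10-Mbit/s link, 10^9 cycles on an idle
# 10^10-cycle/s server: 0.1 s communication + 0.1 s computation.
lat = offload_latency(1e6, 1e7, 1e9, 0.0, 1e10)  # → 0.2 s
```

Whether a task meets its deadline then reduces to checking lat ≤ ϕ.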

3.5 Online mobile server dispatching scheme

In this section, an efficient online UAV-mounted edge server dispatching scheme is introduced to provide mobile-to-mobile services. The optimization problem is first formulated as a variable-sized bin-packing problem with geographic constraints. Then,
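To give a flavour of the bin-packing view before the formal formulation, the greedy first-fit sketch below assigns each task to the first server that both covers it and has spare capacity. The function name, coverage test, and capacity threshold are illustrative assumptions, not the dispatching scheme proposed in this chapter.

```python
import math

def first_fit_dispatch(tasks, servers, p_cycles, qe_cycles, r_cov):
    """tasks / servers: lists of (u, v) ground / hover positions.
    Returns {server_index: [task_indices]} under a per-server cycle
    budget qe_cycles and coverage radius r_cov; tasks outside every
    server's coverage (or capacity) simply remain unassigned."""
    load = [0.0] * len(servers)
    assignment = {n: [] for n in range(len(servers))}
    for m, (um, vm) in enumerate(tasks):
        for n, (un, vn) in enumerate(servers):
            covered = math.hypot(un - um, vn - vm) <= r_cov
            if covered and load[n] + p_cycles <= qe_cycles:
                load[n] += p_cycles
                assignment[n].append(m)
                break
    return assignment
```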
