A Hybrid Framework for Condition-Based Maintenance under Imperfect Monitoring

Victor S. Ris

Master’s Thesis Econometrics, Operations Research and Actuarial Studies

Supervisor: Dr. B. de Jonge

Abstract

Even though condition monitoring technology is rapidly developing, a system’s deterioration state cannot always be determined with certainty. Particularly in highly complex systems, algorithms might misclassify the condition state, potentially leading to suboptimal maintenance decision-making. In this study, we investigate condition-based maintenance policies under imperfect continuous monitoring and imperfect failure root cause diagnosis. We formulate the problem as a partially observed Markov decision process and we solve for an optimal maintenance and inspection policy using the value iteration algorithm. We propose a framework that adaptively combines knowledge about the system’s stochastic deterioration process with imperfect condition monitoring information. A simulation study suggests that our hybrid approach is preferred over traditional approaches. Considerable cost reduction is achieved even for low levels of state classification accuracy. We perform numerical experiments based on network quality degradation in a wireless telecommunications context to evaluate practical implications under various circumstances.

1 Introduction

Continuous condition monitoring of operational systems is gaining significant attention in both industry and academia. Recent developments can largely be attributed to the rise of the Internet of Things (IoT) and Industry 4.0, in which automated decision-making based on real-time sensor data plays a central role (Olsen and Tomlin, 2020). One of the applications for which continuous monitoring is particularly useful is the field of condition-based maintenance (CBM). In CBM policies, maintenance decisions are based on the unit’s level of deterioration, as opposed to ‘traditional’ time-based maintenance (TBM) approaches in which maintenance decision-making is purely based on the system’s age. Following De Jonge and Scarf (2020), we observe that research on CBM has developed rapidly over the past two decades. As a result, many theoretical and practical cases have been studied and associated maintenance policies have been proposed.

A major part of the academic contributions on CBM assumes that sensor data provide a perfect image of a unit’s condition. However, in many cases and for various reasons, it might be hard to obtain an indicator that reveals the unit’s true level of degradation with certainty or even high accuracy (Ghasemi et al., 2010). One of the explanations is that, for both humans and computers, it can be a challenge to interpret sensor data, particularly when multiple sensors together provide information about the unit’s condition. Machine learning (ML) and other statistical techniques have been employed to analyse multi-dimensional time series monitoring data in order to estimate an aggregated indicator for a unit’s deterioration level (Stetco et al., 2019). Although the accuracy of these estimates is improving, there will always be some degree of uncertainty to be taken into account. Moreover, just as deterioration state classification might be imperfect, it can, by similar reasoning, be difficult to pinpoint the underlying root cause of degradation. In fact, uncertainty in failure diagnosis may eventually result in ineffective or inefficient repair actions (Pham and Wang, 1996). One can imagine that this might hold for data-driven automated root cause analysis, particularly if the repair action is performed without physical human inspection or interaction. This is the case, for example, when an agent in an operating centre decides to reboot the system or modify its settings without having enough information to be sure that those actions would actually overcome the failure root cause and thus improve the system’s condition.

interpret for customer service agents. In fact, translation of complex network data from Wi-Fi systems into accurate identification of network failure root causes is not a straightforward task (Kakadia and Ramirez-Marquez, 2020). That is why ISPs are increasingly investing in finding innovative ways to automatically optimise the user’s quality of experience (QoE) and to minimise expenses by preventing labour-intensive customer service calls and (much more expensive) physical visits by a network mechanic. In order to achieve such cost reductions, computations based on sensor data can provide insights into the quality of the user’s internet connection that might support (semi-)automated improvement actions. Additionally, a source of information about the system’s condition, besides sensor data, might be prior knowledge about the general process of quality degradation in wireless communication networks. Based on historical data about the deterioration of networking systems, ISPs could make assumptions about underlying stochastic processes that drive degradation of the user’s QoE. Even more knowledge could be obtained by asking the consumer directly about her network quality. This would allow for a low-cost type of ‘inspection’ that immediately reveals the system’s true condition state. Combining information from these sources would potentially result in valuable insights that can support decision-making on the ISP’s actions to improve users’ internet quality. Still, failure root cause diagnosis and therefore repair action effectiveness would remain imperfect even if more information were available, which has to be accounted for in the decision model. Also, user QoE cannot be flawless all the time, because costs are associated with repair actions and unforeseen deterioration events will still occur. However, the ISP’s aim is to keep user satisfaction as high as possible in order to avoid costly churn and negative word-of-mouth related to bad network condition states.

optimal policy is found by applying the widely used value iteration algorithm (VIA). We evaluate and assess the added value of considering real-time condition monitoring data by simulation of a unit’s deterioration process. A numerical study shows that including continuous monitoring data as an additional source of information can reduce long-term maintenance costs considerably, even when deterioration state classification is imperfect.

The remainder of this study is organised as follows. In Section 2, we review existing literature in the field of CBM under various types of uncertainty as well as its relation to quality monitoring in wireless telecommunication networks. We provide a formal description of the considered problem setting in Section 3, for which a solution framework and approach are presented in Section 4. Next, in Section 5, we elaborate on the performance of the proposed approach and we conduct various numerical experiments on factors affecting maintenance policy performance in wireless networking systems. A conclusion of our study and a discussion of directions for future research are provided in Section 6.

2 Literature Review

In this review, we discuss existing studies related to the problem at hand. We show practical and academic relevance of the topic, as well as a gap in the current state of research that this study aims to address. In order to do so, we first consider previous literature on the application of operational research (OR) techniques in the field of telecommunication. Second, we review studies that investigate various methods for degradation state classification and failure diagnosis based on sensor data. Third, we discuss research developments on CBM with an emphasis on efforts assuming imperfect condition monitoring and imperfect repair.

gap on how [telecommunication] operators can automatically associate poor user experience, relevant network metrics and root causes with a suitable model that can be analyzed and optimized”. Their study combines multiple ML techniques for anomaly detection and failure root cause identification, so that network performance can be monitored automatically and actionable insights can be presented to network mechanics. Although this contribution promises high accuracy of the algorithm outputs, neither optimisation from a cost perspective nor the impact of potential misclassifications is considered.

Ramirez-Marquez, 2020).

The more general problem of condition state classification and failure detection and prediction based on sensor monitoring data has been investigated also by OR scholars for a wide range of applications. The challenge lies particularly in obtaining valuable and interpretable information from multi-dimensional sensor data. For example, Dong and He (2007) employ a hidden semi-Markov model for multi-sensor equipment diagnosis and prognosis, while Zhang et al. (2015) focus on approaches to anomaly detection in high-dimensional monitoring data for failure detection in industrial systems. Specific applications for analysis of multi-sensor monitoring data in operations are, e.g., condition-based maintenance planning for railways (Rabatel et al., 2011) and transmission units for trucks (Kim et al., 2011; Makis et al., 2006), prediction of copper inductor erosion (Christer et al., 1997), and car engine reliability estimation (Das Chagas Moura et al., 2011). With respect to continuous condition monitoring, we find the analysis by Wang and Wang (2015) relevant to our study, since it combines imperfect condition monitoring with perfect manual inspections. The authors establish a delay-time model with two condition monitoring thresholds: one for performing inspection and one for performing preventive replacement.

Considering our review of the above literature, we observe that, to the best of our knowledge, the current state of research lacks an approach that fits some specific requirements and assumptions on decision-making for quality improvement in wireless telecommunications. Although various studies investigate data-driven state classification and failure diagnosis, only very few actually pay attention to how the obtained information can be utilised efficiently. To address this gap in the academic literature, we find that CBM optimisation approaches provide a suitable framework for decision support on how and when to take action in the customer’s network to ensure a sufficient level of QoE.

3 Problem Description

We consider a continuously monitored single-unit system that deteriorates stochastically according to a discrete-time Markov chain (DTMC) on a finite state space with known transition probability matrix $P$. The unit’s deterioration states are denoted by $1, 2, \ldots, m, m+1$, where a unit in state 1 is as-good-as-new, $m$ denotes the deterioration state before failure, and in state $m+1$ the unit is failed. We let $S = \{1, 2, \ldots, m, m+1\}$ denote the state space on which the unit deteriorates. We assume that continuous condition monitoring is imperfect and, therefore, that the unit’s true deterioration state is not directly observed. Instead, the decision-maker is aware of a probability distribution over the underlying deterioration states, referred to as the knowledge state (Maillart, 2006). We let $\theta = [\theta_1, \ldots, \theta_m, \theta_{m+1}]$ denote the knowledge state in which $\theta_i \geq 0$ is the probability that the system is currently in deterioration state $i \in S$. In addition, we assume availability of perfect inspections (IN) that can be performed to reveal the unit’s true deterioration state with certainty. The true deterioration state is also revealed after repair. We assume that failure is obvious and thus self-announcing. Hence, in an operational state $s \in S'$, where $S'$ is given by $S \setminus \{m+1\}$, we always have $\theta_{m+1} = 0$, while upon failure we have $\theta_{m+1} = 1$.

In addition to historical knowledge about the unit’s stochastic deterioration process in combination with perfect inspections that can be performed to reveal the unit’s deterioration state, we suppose the availability of condition monitoring data $\psi = [\psi_1, \ldots, \psi_m, \psi_{m+1}]$. We assume that $\psi$ is stochastically related to the true deterioration state. Therefore, we let $\psi_i$ denote the probability that the unit is in condition state $i \in S$, estimated based on the unit’s condition information that is provided by sensors. This monitoring information is imperfect in the sense that it predicts the true deterioration level at some known accuracy level $\eta \in [0, 1]$. The accuracy level $\eta$ can be interpreted as the average probability that $\psi$ assigns to the true deterioration state. It can be determined based on $T \in \mathbb{N}$ historical (observed) deterioration states and the associated monitoring information. We let $x_t$ denote the true deterioration state at time $t$ and we let $\psi_t = [\psi_{t,1}, \ldots, \psi_{t,m}, \psi_{t,m+1}]$ denote the monitoring information state at time $t$ ($t = 1, 2, \ldots, T$). Then, $\eta$ is the average of $\psi_{t,i}$ given that $x_t = i$, with $i \in S$, over all $T$ observed time periods.
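The definition of the accuracy level can be illustrated with a minimal sketch. It assumes that states are coded 1, ..., m+1 and that each monitoring vector is stored as a list; the function name `monitoring_accuracy` and the example history are hypothetical and only serve as an illustration.

```python
def monitoring_accuracy(true_states, psi_history):
    """Average probability that the monitoring vector assigns to the true state.

    true_states: observed states x_1, ..., x_T, coded 1..m+1.
    psi_history: monitoring vectors psi_1, ..., psi_T (index 0 <-> state 1).
    """
    if len(true_states) != len(psi_history) or not true_states:
        raise ValueError("need two equally long, non-empty histories")
    total = sum(psi[x - 1] for x, psi in zip(true_states, psi_history))
    return total / len(true_states)


# Hypothetical history with m = 2 (states 1 and 2 operational, state 3 failed).
x = [1, 1, 2, 2]
psi = [
    [0.8, 0.2, 0.0],
    [0.6, 0.4, 0.0],
    [0.3, 0.7, 0.0],
    [0.5, 0.5, 0.0],
]
eta = monitoring_accuracy(x, psi)  # (0.8 + 0.6 + 0.7 + 0.5) / 4
```

In this toy history the monitoring vectors assign on average a probability of 0.65 to the state that was actually observed.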

probability $q_{ij}$. Consequently, we have that $\sum_{j=1}^{i} q_{ij} = 1$, $q_{ij} \geq 0$ if $j < i$, and $q_{ij} = 0$ if $j > i$ for all $i = 2, \ldots, m$. If PM is performed when $i = 1$, then $j = 1$ with certainty. We introduce an $m \times m$ maintenance effectiveness matrix $Q$ containing probabilities $q_{ij}$ as its $(i, j)$th element. It follows that $Q$ is of the form

\[
Q = \begin{bmatrix} e_1 & 0 \\ \Sigma & \mathbf{0} \end{bmatrix},
\]

where $\Sigma$ is a lower triangular $(m-1) \times (m-1)$-dimensional matrix, $\mathbf{0}$ is a column vector of zeroes of length $m-1$, and $e_1$ is a row vector of length $m-1$ with 1 as its first element and zeroes elsewhere.

Given that the unit is in an operational state $s \in S'$, the cost of simply continuing operations without performing maintenance or inspection, i.e., by ‘doing nothing’ (DN), is denoted by $0 \leq c(\theta) < 1$, where the function $c : [0, 1]^{m+1} \to \mathbb{R}_+$ is nondecreasing in $\theta$. By making operational costs knowledge state-dependent, we allow for incorporation of costs associated with a decrease in customer satisfaction or productivity as a result of degraded quality. We let the cost parameter for inspection be denoted by $c_{\mathrm{in}} > 0$, for performing preventive maintenance by $c_{\mathrm{pm}} > 0$, and for performing corrective maintenance by $c_{\mathrm{cm}} = 1$, with $c_{\mathrm{in}} \leq c_{\mathrm{pm}} \leq c_{\mathrm{cm}} = 1$.

Our goal is to find an optimal CBM policy that minimises the long-run cost rate over an infinite time horizon.

4 Methodology

In this section, we describe how we propose to solve the problem as described in Section 3. We first elaborate in Section 4.1 on the unit’s stochastic deterioration process. Second, in Section 4.2, we propose a POMDP formulation of the model. Then, we apply the value iteration algorithm as a solution method for finding an optimal maintenance policy in Section 4.3. Finally, we present a simulation-based framework for the inclusion of continuous monitoring data in Section 4.4.

4.1 Deterioration Process

For our analysis, we assume the availability of sufficient historical deterioration data in order to make valid assumptions on the unit’s underlying stochastic deterioration process. As we indicated in previous sections, we consider, primarily for its general applicability and its analytical benefits, a unit that deteriorates according to a discrete-time Markov chain. We note that a DTMC is characterised by an $(m+1) \times (m+1)$ transition probability matrix $P$, and we assume that matrix $P$ is known. The $(i, j)$th element of $P$ is given by $p_{ij}$, the probability that the unit moves from state $i$ to state $j$ within one decision epoch. Obviously, we have that $\sum_{j=1}^{m+1} p_{ij} = 1$ for all $i \in S$. Furthermore, in maintenance optimisation literature, it is generally assumed that a unit’s condition cannot improve without performing a maintenance action, i.e., performing either PM or CM. This implies that the transition probability matrix $P$ is upper triangular, meaning that all entries below the main diagonal are equal to zero. In addition, as the failed state is absorbing, we know that $p_{m+1,m+1} = 1$. Hence, we can write

\[
P = \begin{bmatrix} \Lambda & r \\ \mathbf{0} & 1 \end{bmatrix},
\]

with $\Lambda$ an upper triangular $m \times m$ matrix, $r$ a column vector of length $m$, and $\mathbf{0}$ a row vector of zeroes of length $m$.

There are various methods for computing (estimates for) transition matrix $P$. In De Jonge (2019), a method is proposed for discretising continuous-time continuous-state deterioration processes in order to obtain $P$. An explicit example is given for discretisation of a gamma process, which is, according to Van Noortwijk (2009), a good fit for units that deteriorate gradually due to continuous usage and is popular for its flexibility. Although the method as described in De Jonge (2019) is widely applicable and has several benefits, it requires an assumption about the probability distribution of the deterioration process, which may in some practical cases not be possible or desirable to make. Alternatively, we could choose to estimate $P$ based on (uncensored) historical monitoring data, without fitting these data to a specific probability distribution. For example, let us suppose that the unit’s true condition state $x_t$ is observed at times $t = 1, 2, \ldots, T$ for some finite $T \in \mathbb{N}$. Also, we assume that transition probabilities are stationary, i.e., $p_{ij}$ remains the same for each time interval. Then, it is shown in Anderson and Goodman (1957) that the maximum likelihood estimates $\hat{p}_{ij}$ for $p_{ij}$ can be computed as

\[
\hat{p}_{ij} = \frac{\sum_{t=1}^{T-1} \mathbb{1}_{\{x_{t+1} = j \,\mid\, x_t = i\}}}{\sum_{j=1}^{m+1} \sum_{t=1}^{T-1} \mathbb{1}_{\{x_{t+1} = j \,\mid\, x_t = i\}}}, \qquad i, j = 1, \ldots, m, m+1, \tag{1}
\]

where $\mathbb{1}_B$ is the indicator function, which takes value 1 if $B$ is true and 0 otherwise. As can be observed, the approach for computing $\hat{p}_{ij}$ in (1) is primarily based on counting all transitions from state $i$ to state $j$ within a single time step and subsequently normalising the sum of the counts to 1 for each initial state $i$. This is an easy and generally applicable estimation approach in case there is a condition observation $x_t$ available for each $t = 1, 2, \ldots, T$. If for some t observations are missing,
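The counting estimator in Equation (1) can be sketched in a few lines for the fully observed case. The function name `estimate_transition_matrix` and the example path are hypothetical, and leaving rows of never-visited states at zero is an assumption made here for illustration only.

```python
def estimate_transition_matrix(states, n_states):
    """Count-based maximum likelihood estimate of P from one observed path.

    states: observed x_1, ..., x_T, coded 1..n_states, with no gaps.
    Rows of states that are never left in the data stay all zero; this is
    an assumption of the sketch, not a rule taken from the text.
    """
    counts = [[0] * n_states for _ in range(n_states)]
    for current, nxt in zip(states, states[1:]):
        counts[current - 1][nxt - 1] += 1  # one observed i -> j transition
    p_hat = []
    for row in counts:
        total = sum(row)
        p_hat.append([n / total if total else 0.0 for n in row])
    return p_hat


# Hypothetical path with m = 2 operational states and failed state 3.
path = [1, 1, 2, 2, 2, 3]
P_hat = estimate_transition_matrix(path, 3)
```

For this path, state 1 is left twice (once to itself, once to state 2) and state 2 three times, so the first two rows of `P_hat` are (1/2, 1/2, 0) and (0, 2/3, 1/3).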

4.2 POMDP Formulation

Markov decision processes (MDPs) are models for sequential decision-making when outcomes are uncertain. For a rigorous treatment of MDP models, we refer to Puterman (1994). Apart from maintenance, MDPs are widely applied in the field of operational research, notably in inventory optimisation, logistics, and manufacturing (De Jonge, 2018). Because we assume in our study that the unit’s true deterioration state is only known with certainty after performing inspection or maintenance, we can appropriately model this problem as a partially observable Markov decision process (POMDP). In this subsection, we specify the POMDP’s decision epochs, states, actions, rewards (i.e., costs), and transition probabilities.

4.2.1 Decision Epochs

As we consider a discrete-time problem, we have a set of decision epochs T = {1, 2, . . . , N }. At each decision epoch t ∈ T , the decision-maker can choose an action from the set of available actions (see Section 4.2.3). Since we aim to find an optimal CBM policy that minimises the long-run cost rate, we consider an infinite time horizon, i.e., we select N = ∞.

4.2.2 System States

We use a probability distribution over the underlying deterioration states to represent the system state. This probability distribution is also known as the unit’s knowledge state (Maillart, 2006). It will be shown shortly that this information highly depends on the assumed stochastic deterioration process that defines transition matrix $P$. As introduced in Section 3, we let $\theta = [\theta_1, \ldots, \theta_m, \theta_{m+1}]$ denote a knowledge state in which $\theta_i \geq 0$ represents the probability that the system is currently in deterioration state $i$, with $i = 1, \ldots, m, m+1$. The infinite knowledge state space is denoted by $\Omega$ and is given by

\[
\Omega = \left\{ \theta \geq 0 : \sum_{i=1}^{m} \theta_i = 1,\ \theta_{m+1} = 0 \right\}.
\]

Because system failures are obvious, we always have $\theta_{m+1} = 0$ if the system is still working. Once a failure happens, $\theta_{m+1} = 1$.

Similarly, the probability $r(\theta)$ that failure occurs, the so-called hazard function, can be computed as

\[
r(\theta) = \sum_{i=1}^{m} \theta_i p_{i,m+1} = 1 - R(\theta).
\]

We let $\theta' = [\theta'_1, \ldots, \theta'_m, \theta'_{m+1}]$ denote the knowledge state at the next decision epoch, given that no failure occurs and that no inspection or maintenance is performed. We can write $\theta'(\theta)$ as a function of the previous knowledge state $\theta$ and we observe that its $i$th element is given by

\[
\theta'_i(\theta) =
\begin{cases}
\dfrac{(\theta \cdot P)_i}{R(\theta)}, & i = 1, 2, \ldots, m, \\[1ex]
0, & i = m+1.
\end{cases} \tag{2}
\]

Now, let $e_i$ denote an $(m+1)$-dimensional vector of zeroes with 1 as the $i$th element. We note that after performing maintenance or inspection, the true deterioration state is revealed and that, consequently, the unit’s knowledge state will always be $\theta = e_i$ for some $i \in S'$. Then, following Maillart (2006), if the decision-maker adheres to the optimal maintenance and inspection policy, we only need to solve the problem for a limited number of knowledge states $\theta \in \Omega$. In order to achieve this, we introduce the following notation. Let $\theta^{i,j} = [\theta^{i,j}_1, \ldots, \theta^{i,j}_m, \theta^{i,j}_{m+1}]$, $i \in S$, $j = 0, 1, 2, \ldots$, denote the knowledge state given that deterioration state $i$ was observed with certainty $j$ decision epochs ago. This notation implies that $\theta^{i,0} = e_i$ and that all other knowledge states can iteratively be computed using Equation (2), i.e.,

\[
\theta^{i,j} = \theta'\bigl(\theta^{i,j-1}\bigr), \qquad i = 1, 2, \ldots, m, \quad j = 1, 2, \ldots.
\]

An important observation is that it is reasonable to assume that, under the optimal policy, either an inspection or some maintenance action will always be performed within some sufficiently large number of time periods $J$ (De Jonge, 2018). Following this reasoning, a finite subset $\Omega' \subset \Omega$ is constructed and is given by

\[
\Omega' = \bigl\{ \theta^{i,j},\ i = 1, 2, \ldots, m,\ j = 0, 1, \ldots, J \bigr\}.
\]
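The update in Equation (2) and the construction of the knowledge states reached j epochs after an observation can be sketched as follows. The sketch uses the relation between the failure and survival probabilities stated above to obtain the survival probability; the function names and the matrix `P` in the example are hypothetical.

```python
def failure_probability(theta, P):
    """r(theta): sum over operational states i of theta_i * p_{i,m+1}."""
    m = len(P) - 1
    return sum(theta[i] * P[i][m] for i in range(m))


def update_knowledge_state(theta, P):
    """Equation (2): next knowledge state given no failure and no action."""
    m = len(P) - 1
    survival = 1.0 - failure_probability(theta, P)  # R(theta) = 1 - r(theta)
    if survival <= 0.0:
        raise ValueError("failure is certain, so Equation (2) is undefined")
    propagated = [sum(theta[i] * P[i][j] for i in range(m + 1)) for j in range(m)]
    return [p / survival for p in propagated] + [0.0]


def knowledge_states_since_observation(i, P, J):
    """Knowledge states 0, 1, ..., J epochs after state i (1-based) was observed."""
    theta = [0.0] * len(P)
    theta[i - 1] = 1.0  # the unit vector e_i
    chain = [theta]
    for _ in range(J):
        theta = update_knowledge_state(theta, P)
        chain.append(theta)
    return chain


# Hypothetical upper triangular P with m = 2 operational states.
P = [
    [0.8, 0.15, 0.05],
    [0.0, 0.7, 0.3],
    [0.0, 0.0, 1.0],
]
chain = knowledge_states_since_observation(1, P, J=2)
```

Collecting these chains for all operational starting states and a common horizon J gives one possible representation of the finite subset described above.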

in decision-making.

Let us now suppose that at each decision epoch, system performance sensors provide condition information $\psi = [\psi_1, \ldots, \psi_m, \psi_{m+1}]$, with $0 \leq \psi_i \leq 1$ denoting the estimated probability that the unit is in condition state $i$, with $i = 1, 2, \ldots, m$. For example, this condition information could be the output from algorithms that classify the condition state based on (multiple) monitoring sensors. Returning to our application of wireless networking systems, monitoring information could refer to an MOS prediction algorithm that assigns probabilities $\psi_i$ to all network quality states $i \in S$ based on QoS data. The accuracy of the information $\psi$ might depend on the complexity of the system, as well as on the sophistication of the classification algorithm and of the system’s performance sensors. Now, if we wish to include $\psi$ in our maintenance decision-making, finding an optimal policy merely on the elements of $\Omega'$ does not suffice. Maillart (2006) faces a similar problem when considering a deteriorating system under imperfect inspections. This can be tackled by choosing some integer $M$ and using it to construct a set of points $\{0, \frac{1}{M}, \frac{2}{M}, \ldots, 1\}$ from which the probabilities $\psi_i$ are allowed to take values such that $\sum_{i=1}^{m+1} \psi_i = 1$. Hence, we can introduce the finite set of knowledge states $\Omega'' \subset \Omega$ given by

\[
\Omega'' = \left\{ \left[ \frac{j_1}{M}, \frac{j_2}{M}, \ldots, \frac{j_m}{M}, 0 \right] \text{ with } j_i \in \mathbb{N} \cup \{0\} \text{ for all } i = 1, 2, \ldots, m : \sum_{i=1}^{m} j_i = M \right\}.
\]

When the system is in an operational state, we have $\psi \in \Omega''$ and if the system is failed, we have $\psi = e_{m+1}$. However, when we consider a knowledge state $\theta \in \Omega''$, a problem arises when $\theta$ is updated and the result $\theta'(\theta)$ is not an element of $\Omega''$. In order to overcome that issue, we can approximate $\theta'(\theta)$ by the vector $\tilde{\theta} \in \Omega''$ that has the smallest Euclidean distance to $\theta'(\theta)$. That is to say,

\[
\tilde{\theta} = \arg\min_{\omega \in \Omega''} \lVert \theta'(\theta) - \omega \rVert, \tag{3}
\]

where $\lVert \cdot \rVert$ denotes the Euclidean norm. It can be shown that at most the $2^{m-1}$ points $\omega \in \Omega''$ surrounding $\theta'(\theta)$ need to be considered (Maillart, 2006).

Finally, we remark that constructing such a finite knowledge state space is critical for us being able to apply the widely used value iteration algorithm for Markov decision processes as we discuss in Section 4.3.
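A minimal sketch of the discretised knowledge state space and the projection in Equation (3) is given below. It enumerates the grid with a stars-and-bars construction and, for simplicity, searches the whole grid by brute force instead of only the surrounding points; both choices, as well as the function names, are assumptions of this sketch.

```python
from itertools import combinations
from math import dist


def knowledge_grid(m, M):
    """All vectors (j_1/M, ..., j_m/M, 0) with non-negative integers j_i summing to M."""
    grid = []
    # Stars and bars: place m - 1 bars among M + m - 1 slots.
    for bars in combinations(range(M + m - 1), m - 1):
        edges = (-1,) + bars + (M + m - 1,)
        counts = [edges[k + 1] - edges[k] - 1 for k in range(m)]
        grid.append(tuple(n / M for n in counts) + (0.0,))
    return grid


def project_to_grid(theta, grid):
    """Equation (3) by brute force: the grid point nearest to theta."""
    return min(grid, key=lambda omega: dist(theta, omega))


grid = knowledge_grid(m=2, M=4)
# Hypothetical updated knowledge state that is not itself a grid point.
nearest = project_to_grid((0.8 / 0.95, 0.15 / 0.95, 0.0), grid)
```

With m = 2 and M = 4 the grid has five points, and the example vector (approximately 0.842, 0.158, 0) is mapped to (0.75, 0.25, 0). A brute-force search is adequate for small grids; for larger m, restricting the search to the neighbouring points mentioned above would be the natural refinement.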

4.2.3 Available Actions

At the beginning of each decision epoch, the decision-maker has to choose an action. When the system is in an operational state $s \in S'$, one can decide to do nothing (denoted by DN), to inspect the

the beginning of the time period, corrective maintenance (CM) must be performed. Since the action space is condition state-dependent, we define action space $A_\theta$ as a function of knowledge state $\theta \in \Omega''$. In fact, $A_\theta$ is given by

\[
A_\theta =
\begin{cases}
\{\mathrm{DN}, \mathrm{IN}, \mathrm{PM}\}, & \theta \in \Omega'' \setminus \{e_{m+1}\}, \\
\{\mathrm{CM}\}, & \theta = e_{m+1}.
\end{cases}
\]

Note that we have $\theta = e_{m+1}$ if and only if the unit is in the failed state.

4.2.4 Costs

Let $c(\theta, a)$ denote the cost (or reward) function that gives the immediate costs incurred in case action $a \in A_\theta$ is chosen in state $\theta \in \Omega''$. We first consider the cost if the decision-maker chooses to do nothing. In that case, the system simply continues its operations. As we discussed earlier, we let these operational costs depend on deterioration state $s$, as deterioration of the system might negatively affect operational quality. However, since merely knowledge state $\theta$ is observed and $s$ is not, we let the cost of operations depend on $\theta$ instead of $s$. That is, we define the cost incurred if no maintenance or inspection is performed as the nondecreasing function $c : [0, 1]^m \to \mathbb{R}_+$. For the sake of simplicity, in this study, we consider $c(\theta) = \sum_{k=1}^{m} c_k \cdot \theta_k$ for scalars $0 \leq c_k < 1$, $k \in S'$. Although not mandatory, it is logical to assume $c_j \geq c_k$ for $j > k$.

If an inspection is performed, inspection cost $c_{\mathrm{in}}$ is incurred plus the operational costs associated with the revealed state. That is, if inspection reveals deterioration state $j$, we have that the immediate inspection costs amount to $c_{\mathrm{in}} + c(e_j)$. We know that state $j$ is revealed with probability $\theta_j$, so that the expected cost of inspection equals $c_{\mathrm{in}} + \sum_{j=1}^{m} \theta_j \cdot c(e_j)$. Next, since preventive maintenance is imperfect, the unit’s state after performing preventive maintenance depends on probability $q_{ij}$. Therefore, we have that the expected immediate cost for performing preventive maintenance is $c_{\mathrm{pm}} + \sum_{i=1}^{m} \theta_i \sum_{j=1}^{m} q_{ij} \cdot c(e_j)$. Lastly, if corrective maintenance needs to be performed, $c_{\mathrm{cm}} + c(e_1)$ is incurred.
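The expected immediate costs above can be written out directly for the linear cost function. The sketch below is only an illustration; the function names, the cost values and the matrix `Q` are hypothetical.

```python
def operating_cost(theta, c):
    """Linear operating cost: sum over operational states k of c_k * theta_k."""
    return sum(ck * tk for ck, tk in zip(c, theta))


def expected_inspection_cost(theta, c, c_in):
    """Inspection cost plus expected operating cost of the revealed state."""
    # With the linear cost function, the cost in a revealed state j is c_j.
    return c_in + sum(theta[j] * c[j] for j in range(len(c)))


def expected_pm_cost(theta, c, c_pm, Q):
    """PM cost plus expected operating cost of the state reached after PM."""
    m = len(c)
    return c_pm + sum(
        theta[i] * sum(Q[i][j] * c[j] for j in range(m)) for i in range(m)
    )


# Hypothetical example with m = 2: PM always restores the unit to state 1.
c = [0.0, 0.2]
Q = [[1.0, 0.0], [1.0, 0.0]]
theta = [0.5, 0.5, 0.0]
cost_dn = operating_cost(theta, c)
cost_in = expected_inspection_cost(theta, c, c_in=0.05)
cost_pm = expected_pm_cost(theta, c, c_pm=0.3, Q=Q)
```

Because the operating cost is linear in the knowledge state, the expected inspection cost in this sketch is simply the inspection cost plus the operating cost of the current knowledge state.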

4.2.5 Transition Probabilities

When the decision-maker chooses to perform an inspection at the beginning of a decision epoch, the transition probabilities of the unit’s deterioration state in the next time period are given by transition probability matrix $P$. Furthermore, knowledge state $\theta$ is updated to $e_i$ with probability $\theta_i$, $i \in S'$. As the unit might continue deteriorating in the same decision epoch as the inspection takes place, the knowledge state at the beginning of the next period will be updated using Equations (2) and (3).

As we outlined in Section 3, performance of preventive maintenance might be imperfect, implying that the unit’s deterioration state after a preventive maintenance action need not be as-good-as-new, potentially as a result of imperfect failure diagnosis. In fact, we suppose that if a unit’s level of deterioration currently is $i \in S'$, preventive maintenance will bring the unit back to some less deteriorated state $j < i$ with probability $q_{ij}$ immediately. Therefore, when preventive maintenance is chosen, the knowledge state at the next time period will be the approximation according to Equation (3) of $\theta'(e_j)$ with probability $q_{ij}$ for all $j = 1, \ldots, i$, $i = 2, \ldots, m$. If $i = 1$, the knowledge state at the next time period will equal $\theta'(e_1)$ with certainty.

In case the system is in the failed state at the beginning of the decision epoch, corrective maintenance is performed. Then, the unit goes back to the as-good-as-new state immediately with probability 1 and the knowledge state is updated to $e_1$. Consequently, at the beginning of the next decision epoch, the knowledge state will be the point $\omega \in \Omega''$ closest to $\theta'(e_1)$.

4.3 Value Iteration

Now that we have fully specified the POMDP, we will show how an optimal maintenance policy can be determined. First, we define cost-to-go associated with the various actions. Thereafter, we propose the value iteration algorithm as our solution approach.

4.3.1 Cost-to-go

Let the value function $v^n(\theta)$ denote the minimum total expected cost that will be incurred over the remaining $n \in \mathbb{N}$ decision epochs, given that the system’s current knowledge state is $\theta$ and given final rewards specified by $v^0$. Below, we define for each action $a \in A_\theta$ the so-called cost-to-go, i.e., the total expected cost over all remaining time periods when action $a$ is chosen in the current decision epoch. Note that as we study a model in which various outcomes are uncertain, we calculate expected total costs, which is appropriate under optimisation of the long-run cost rate. Furthermore, for simplicity of notation, we from now on consider the updated knowledge state $\theta'(\theta)$ to be already approximated using the method in Equation (3), such that $\theta'(\theta) \in \Omega''$.

n decision epochs are left is given by

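\[
\mathrm{DN}(n, \theta) = R(\theta) \Bigl[ c(\theta) + v^{n-1}\bigl(\theta'(\theta)\bigr) \Bigr] + r(\theta) \Bigl[ c_{\mathrm{cm}} + c(e_1) + v^{n-1}\bigl(\theta'(e_1)\bigr) \Bigr].
\]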

The first term refers to the case where the unit is still operable in the next decision epoch, whereas the second term represents the expected total costs for the remaining periods if the unit fails.

Defining the cost-to-go when performing preventive maintenance is slightly more cumbersome. We first note that preventive maintenance will not be performed if $\theta = e_1$. Now, let us consider $\theta = e_i$, with $i = 2$. Then, because performing PM will bring the unit’s knowledge state to $e_1$, we have

\[
\mathrm{PM}(n, e_2) = c_{\mathrm{pm}} + v^n(e_{i-1}) = c_{\mathrm{pm}} + v^n(e_1) = c_{\mathrm{pm}} + \mathrm{DN}(n, e_1).
\]

We know that $v^n(e_1) = \mathrm{DN}(n, e_1)$, since, economically speaking, it does not make sense to perform maintenance or inspection when the unit is in the as-good-as-new state with certainty. Furthermore, $v^n(e_i) = \min\{\mathrm{DN}(n, e_i), \mathrm{PM}(n, e_i)\}$ if $i = 2, \ldots, m$, as by similar reasoning, inspection will not be performed when the true deterioration state is known. So, for $i = 3$, we obtain

\[
\begin{aligned}
\mathrm{PM}(n, e_3) &= c_{\mathrm{pm}} + q_{i,i-1} v^n(e_{i-1}) + q_{i,i-2} v^n(e_{i-2}) \\
&= c_{\mathrm{pm}} + q_{3,2} v^n(e_2) + q_{3,1} v^n(e_1) \\
&= c_{\mathrm{pm}} + q_{3,2} \min\bigl\{ \mathrm{DN}(n, e_2), \mathrm{PM}(n, e_2) \bigr\} + q_{3,1} \mathrm{DN}(n, e_1) \\
&= c_{\mathrm{pm}} + q_{3,2} \min\bigl\{ \mathrm{DN}(n, e_2), c_{\mathrm{pm}} + \mathrm{DN}(n, e_1) \bigr\} + q_{3,1} \mathrm{DN}(n, e_1).
\end{aligned}
\]

It follows that in general, for $\theta = e_i$, the cost-to-go for preventive maintenance is defined as

\[
\mathrm{PM}(n, e_i) = c_{\mathrm{pm}} + \sum_{j=1}^{m} q_{ij} v^n(e_j), \qquad i = 1, 2, \ldots, m.
\]

Recall that by definition of $Q$, $q_{1,1} = 1$ and $q_{ij} = 0$ if $j \geq i$, for $i = 2, \ldots, m$ and $j \in S'$. As can be observed above, $\mathrm{PM}(n, e_i)$ does not depend on itself and can be computed recursively for all $i = 2, \ldots, m$. Now, we can formulate the cost-to-go for performing PM for all $\theta \in \Omega''$. We note that performing preventive maintenance on a unit currently in knowledge state $\theta$ brings the unit to knowledge state $e_j$ with probability $\sum_{i=j}^{m} \theta_i \cdot q_{ij}$, $j = 1, \ldots, m-1$. Hence, for all $\theta \in \Omega''$, we have

\[
\mathrm{PM}(n, \theta) = c_{\mathrm{pm}} + \sum_{i=1}^{m} \theta_i \sum_{j=1}^{i} q_{ij} v^n(e_j),
\]

In case an inspection is performed given that the knowledge state is $\theta$, the cost-to-go follows easily by

\[
\mathrm{IN}(n, \theta) = c_{\mathrm{in}} + \sum_{i=1}^{m} \theta_i v^n(e_i).
\]

Then, using the cost-to-go equations for the various actions given above, we define the optimality equation by

\[
v^n(\theta) = \min\bigl\{ \mathrm{DN}(n, \theta), \mathrm{IN}(n, \theta), \mathrm{PM}(n, \theta) \bigr\},
\]

so that the expected total cost over the remaining time periods is minimised by choosing in each decision epoch the action with minimum cost-to-go.

4.3.2 Algorithm

The value iteration algorithm (VIA) is a commonly used method that employs a dynamic programming approach to solve MDPs (Chen et al., 2014). It finds a stationary ε-optimal policy as well as an approximation to the optimal gain g∗, which is interpreted as the average cost per time period (De Jonge, 2018). A sequence of values {v^n}, n = 0, 1, 2, . . . , on the space V of bounded real-valued functions on the state space Ω′′ is generated. That is, every v ∈ V is a function of θ mapping from Ω′′ to ℝ. The VIA iteratively calculates value v^n based on v^{n−1}, with v^0 ∈ V pre-specified, for instance by v^0 = 0. In general, the sequence {v^n} does not converge, so that a stopping criterion has to be determined. Algorithm 1 formally describes the procedure. We remark that parameter ε > 0 should be chosen small and that sp(v), denoting the span of v, is defined as

\[
\mathrm{sp}(v) = \max_{\theta \in \Omega''} v(\theta) - \min_{\theta \in \Omega''} v(\theta).
\]

As discussed in De Jonge (2018), on termination of the algorithm, the tightest lower bound for the optimal cost per decision epoch g∗ is given by

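\[
\min_{\theta \in \Omega''} \left\{ v^{n+1}(\theta) - v^n(\theta) \right\},
\]

and a tightest upper bound by

\[
\max_{\theta \in \Omega''} \left\{ v^{n+1}(\theta) - v^n(\theta) \right\}.
\]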


Algorithm 1: The Value Iteration Algorithm

1. Select v^0 ∈ V, specify ε > 0, and set n = 0;

2. For each θ ∈ Ω′′, compute v^{n+1}(θ) by
   v^{n+1}(θ) = min{DN(n + 1, θ), IN(n + 1, θ), PM(n + 1, θ)},
   following the procedure described in Section 4.3.1;

3. If sp(v^{n+1} − v^n) < ε, then go to Step 4; else increment n by 1 and return to Step 2;

4. For each θ ∈ Ω′′, choose ε-optimal decision d_ε defined by
   d_ε(θ) ∈ arg min_{a ∈ A_θ} {DN(n + 1, θ), IN(n + 1, θ), PM(n + 1, θ)}
   and stop.

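The optimal gain g∗ can then be approximated by the average of the upper and the lower bound. That is,

\[
g^* \approx \frac{1}{2} \left( \min_{\theta \in \Omega''} \left\{ v^{n+1}(\theta) - v^n(\theta) \right\} + \max_{\theta \in \Omega''} \left\{ v^{n+1}(\theta) - v^n(\theta) \right\} \right).
\]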

On a final note, the value iteration algorithm will not terminate in case of periodic optimal solutions for any n < ∞. If such a situation occurs—although this is not the case in our study—we refer to Puterman (1994) for a transformation approach to ensure aperiodicity for all optimal policies.
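To make the stopping rule and the gain bounds concrete, the following sketch runs value iteration with the span criterion on a small, fully observed toy problem. It is a generic illustration under stated assumptions, not the model of this study: the two-state, two-action example, the function name `value_iteration`, and the cost and transition arrays are all hypothetical, and the knowledge-state space is replaced by a plain finite state space.

```python
import numpy as np

def value_iteration(costs, transitions, eps=1e-3, max_iter=10_000):
    """Undiscounted value iteration with the span stopping criterion.

    costs[a][s] is the immediate cost of action a in state s and
    transitions[a][s, s2] the corresponding transition probability.
    Returns the eps-optimal policy, the gain estimate and its bounds.
    """
    v = np.zeros(transitions[0].shape[0])  # v^0 = 0
    for _ in range(max_iter):
        # Cost-to-go of every action in every state, shape (actions, states)
        q = np.array([c + P @ v for c, P in zip(costs, transitions)])
        v_next = q.min(axis=0)
        diff = v_next - v
        if diff.max() - diff.min() < eps:  # sp(v^{n+1} - v^n) < eps
            lower, upper = diff.min(), diff.max()
            return q.argmin(axis=0), 0.5 * (lower + upper), (lower, upper)
        v = v_next
    raise RuntimeError("span criterion not met within max_iter iterations")

# Toy example: state 0 = good, state 1 = bad; action 0 = do nothing, 1 = repair
costs = [np.array([0.0, 1.0]), np.array([0.5, 0.5])]
transitions = [np.array([[0.9, 0.1], [0.0, 1.0]]),
               np.array([[1.0, 0.0], [1.0, 0.0]])]
policy, gain, bounds = value_iteration(costs, transitions)
print(policy, gain, bounds)
```

In this toy problem the optimal policy repairs only in the bad state, and the exact long-run cost rate is 1/22, which lies between the returned bounds.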

4.4 Continuous Monitoring Data

So far, we have not considered condition monitoring data ψ as a part of the decision-making process. However, as we argued in previous sections, (imperfect) continuous monitoring is in many cases likely to be beneficial in determining optimal actions. In this section, we propose a framework for including condition monitoring in addition to knowledge on the unit’s stochastic deterioration process and inspections that reveal the unit’s true level of deterioration. This hybrid approach is based on the notion that due to uncertainties in the deterioration process, shortly after inspection or maintenance, the knowledge state is a more reliable estimation of the true deterioration state than after some decision epochs have elapsed.

Let us re-consider knowledge state notation θi,j ∈ Ω′ with i ∈ S and j = 0, 1, 2, . . . as defined in Section 4.2.2. We introduce aggregated information state π constructed by

\[
\pi = \alpha_j \theta_{i,j} + \beta_j \psi, \tag{4}
\]

with scalars αj, βj ∈ ℝ₊ chosen such that αj + βj = 1 for all j = 0, 1, 2, . . . , and approximated according to Equation (3) to ensure that π ∈ Ω′′. By doing so, the parameters αj and βj corresponding to knowledge states θi,j and monitoring data ψ may vary as the number of periods j since the last perfect observation increases. In this way, knowledge about the unit’s deterioration process and continuous monitoring information are simultaneously and adaptively incorporated in maintenance decision-making. For a clarification of this framework, let us consider Example 1.

Example 1. Consider a unit that deteriorates according to the assumptions in Section 3. Suppose that m = 2 and that transition matrix P is given by

\[
P = \begin{pmatrix} 0.8 & 0.1 & 0.1 \\ 0 & 0.85 & 0.15 \\ 0 & 0 & 1 \end{pmatrix}.
\]

At decision epoch t = 1, it is observed that the system is as-good-as-new. Hence, its true condition xt at time t = 1 is x1 = 1 and consequently, the system’s knowledge state is θ1,0 = e1 = [1, 0, 0]. The monitoring information at time t = 1 is given by ψ = [0.9, 0.1, 0]. It makes sense to choose α0 = 1 and β0 = 0, so that π = [1, 0, 0].

After some time periods have elapsed without failure, for instance when t = 6, the initial knowledge state θ1,0 is updated according to Equation (2) several times, resulting in θ1,5 = [0.59, 0.41, 0]. Suppose that x6 = 2 (which is not observed) and monitoring data suggest ψ = [0.2, 0.8, 0]. Then, the decision-maker would benefit from relying more on ψ than on θ1,5, implying that β5 is likely to exceed α5, for example by selecting β5 = 0.85 and α5 = 0.15. This would result in π = 0.15 · θ1,5 + 0.85 · ψ = [0.26, 0.74, 0]. Depending on the choice of M, π might be approximated using Equation (3).

After many periods without failure, for example at t = 30, we obtain θ1,29 = [0.09, 0.91, 0] and we suppose that x30 = 2 and ψ = [0.15, 0.85, 0]. This final observation shows that good choices for αj need not, under some circumstances, be decreasing in j. The reason is that after many periods without failure (so that j is large), it may very well be the case that θi,j provides in general more accurate information than ψ.
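The numbers in Example 1 can be retraced with a short script. The sketch assumes that Equation (2) is the usual Bayesian update conditional on no failure having occurred (propagate the knowledge state through P, set the failure probability to zero, renormalise). Under that assumption the rounded knowledge states quoted in the example are reproduced. The function names are chosen for this illustration only.

```python
import numpy as np

P = np.array([[0.8, 0.10, 0.10],
              [0.0, 0.85, 0.15],
              [0.0, 0.00, 1.00]])

def propagate_without_failure(theta, P, steps):
    """Update a knowledge state over `steps` periods, given no failure.

    Assumed form of the update: one-step propagation through P followed
    by conditioning on the unit not being in the failed (last) state.
    """
    theta = np.asarray(theta, dtype=float)
    for _ in range(steps):
        theta = theta @ P
        theta[-1] = 0.0
        theta = theta / theta.sum()
    return theta

def aggregate(theta, psi, alpha):
    """Aggregated information state pi = alpha * theta + (1 - alpha) * psi."""
    return alpha * np.asarray(theta) + (1.0 - alpha) * np.asarray(psi)

theta_5 = propagate_without_failure([1.0, 0.0, 0.0], P, steps=5)
pi_6 = aggregate(theta_5, [0.2, 0.8, 0.0], alpha=0.15)
theta_29 = propagate_without_failure([1.0, 0.0, 0.0], P, steps=29)

print(np.round(theta_5, 2))   # rounds to [0.59, 0.41, 0.]
print(np.round(pi_6, 2))      # rounds to [0.26, 0.74, 0.]
print(np.round(theta_29, 2))  # rounds to [0.09, 0.91, 0.]
```

The last line illustrates the point of the example: after 29 periods without failure, the knowledge state alone already places about 0.91 on state 2, more than the monitoring vector ψ = [0.15, 0.85, 0] does.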

One of the benefits of the proposed framework is that the selection of αj and βj can be optimised for all j = 0, 1, 2, . . . in a relatively straightforward way. Although various approaches are possible, let us consider the following simple one. Suppose that we have T historical observations of a unit’s true deterioration state xt at decision epochs t = 1, 2, . . . , T and that no inspections or preventive


two consecutive actions (either inspection or maintenance) is always at most J periods. Then, we need to find values for αj and βj for all j = 0, 1, . . . , J. Now, let θ_t^j = [θ_{t,1}^j, . . . , θ_{t,m}^j, θ_{t,m+1}^j] denote the knowledge state at time t given that it is j decision epochs ago that maintenance or inspection was performed and recall from Section 3 that ψt denotes the monitoring information at time t. We introduce zt = [zt,1, . . . , zt,m, zt,m+1], where zt = ei if xt = i. Thus, zt can be seen as the knowledge state representing the true deterioration state at time t. Then, we can model the elements of zt by using the modelling equations

\[
z_{t,i} = \alpha_j \theta^j_{t,i} + \beta_j \psi_{t,i} + u_{t,i} \quad \text{for all } t = 0, 1, \ldots, T, \text{ each } j = 0, 1, \ldots, J, \text{ and every } i \in S, \tag{5}
\]

with ut,i denoting the unobserved estimation residual at time t when modelling the ith element of zt.

We can model this problem as a constrained optimisation problem. Although the method is sensitive to outliers, an approach that is well-known and used commonly in econometrics for similar models is to minimise the sum of squared residuals (Hayashi, 2000). We let Zj denote the sum of squared residuals for j = 0, 1, . . . , J as a function of αj and βj, which can be computed as

\[
Z_j(\alpha_j, \beta_j) = \sum_{i=1}^{m+1} \sum_{t=1}^{T} u_{t,i}^2 = \sum_{i=1}^{m+1} \sum_{t=1}^{T} \left( z_{t,i} - \alpha_j \theta^j_{t,i} - \beta_j \psi_{t,i} \right)^2,
\]

which obviously is nonlinear in the decision variables αj and βj. We can minimise the values of Zj for all j under the affine restrictions that αj, βj ≥ 0 and that αj + βj = 1. We thus face for every j = 0, 1, . . . , J the following constrained optimisation problem.

\[
\begin{aligned}
\text{Minimise} \quad & Z_j(\alpha_j, \beta_j) \\
\text{subject to} \quad & \alpha_j + \beta_j = 1; \\
& \alpha_j, \beta_j \geq 0.
\end{aligned}
\tag{6}
\]

Ultimately, by solving the nonlinear programming model in Equation (6), we find a pair of estimates for αj and βj that minimises the sum of squared residuals for the estimation of the true level of deterioration for each value of j. The obtained parameter estimates can be used in Equation (4) to adaptively compute the unit’s aggregated information state π, based on which maintenance decisions can be made.
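Because βj = 1 − αj, the problem in Equation (6) reduces to a one-dimensional convex quadratic in αj, so that a closed-form solution followed by clipping to [0, 1] yields the constrained minimiser. The sketch below illustrates this observation. It is not necessarily the solution method used in this study, and the function name `fit_alpha` and the toy data are assumptions. The input arrays are taken to hold, for one fixed j, the rows z_t, θ_t^j and ψ_t of all periods t considered.

```python
import numpy as np

def fit_alpha(z, theta, psi):
    """Least-squares weight alpha_j under alpha_j + beta_j = 1, both >= 0.

    With beta = 1 - alpha the residual equals (z - psi) - alpha * (theta - psi),
    so the unconstrained minimiser is a ratio of two sums; clipping to [0, 1]
    gives the constrained optimum because the objective is convex in alpha.
    """
    z, theta, psi = (np.asarray(x, dtype=float) for x in (z, theta, psi))
    d = theta - psi
    denom = np.sum(d * d)
    if denom == 0.0:
        return 1.0  # theta and psi coincide: every weight fits equally well
    alpha = np.sum((z - psi) * d) / denom
    return float(np.clip(alpha, 0.0, 1.0))

# Toy check with synthetic data (hypothetical, for illustration only)
rng = np.random.default_rng(1)
theta = rng.dirichlet(np.ones(3), size=50)
psi = rng.dirichlet(np.ones(3), size=50)
z = 0.3 * theta + 0.7 * psi
alpha = fit_alpha(z, theta, psi)
beta = 1.0 - alpha
print(alpha, beta)
```

Running this once per value of j = 0, 1, . . . , J gives the full set of weights.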


5 Numerical Results

In this section, we show practical applicability and relevance of the maintenance decision model at hand. We do this by conducting a numerical study related to optimisation of user quality of experience in wireless telecommunication networks. Furthermore, we perform several numerical experiments in order to examine behaviour of the model under various circumstances. First, we introduce a case study in Section 5.1, which forms the base case for our analysis. Then, in Section 5.2, we show the benefits of including continuous monitoring data in the information set on which decision-making is based. We conclude our numerical study with a range of experiments in Section 5.3 to provide practical implications of slight changes in modelling assumptions.

5.1 Case Study

As we discussed in the introduction, our numerical study is inspired by the problem of user quality of experience (QoE) optimisation in wireless networking, specifically in domestic internet networks. We therefore let the deterioration states be represented by the user’s mean opinion score (MOS). MOS is a five-point scale, where a user assigns value 1 to bad QoE and value 5 to excellent QoE. For ease of implementation in our model, we reverse this scale: deterioration state 1 is associated with the highest perceived quality and in deterioration state 5, the system is considered to be failed. Hence, we have m = 4 and S = {1, 2, 3, 4, 5}.

Inspections of the perceived network quality can be performed by asking the user directly how satisfied she is, so that the system’s true quality degradation state is revealed. Since asking this too often would eventually annoy the user, we assume that cost cin = 0.075 is incurred every time the user is asked about her network QoE. The network operator or ISP, who is in this case the decision-maker, may perform potentially automated preventive actions to improve network quality at cost cpm = 0.25. If the network is in the failed state, it must be repaired immediately at higher cost ccm = 1. When no maintenance or inspection is performed, knowledge state-dependent operational cost c(θ) = Σ_{k=1}^{m} ck · θk for scalars 0 ≤ ck < 1, k ∈ S′ = {1, 2, 3, 4} is incurred. As network quality decreases, customer churn is more likely to increase and customers tend to transfer negative messages (either spoken or written) to other consumers, which may affect the ISP’s reputation in an unfavourable manner. Therefore, we choose values of ck such that ck is nondecreasing in k. We select c1 = 0, c2 = 0, c3 = 0.01, and c4 = 0.025.


following a compound Poisson process. According to Van Noortwijk (2009), the compound Poisson process is in general suitable for modelling deterioration as a result of sporadic shocks, as opposed to the Gamma process, which provides a better fit to more gradually deteriorating units under continuous use. Since wireless networking performance is particularly susceptible to sudden events, deterioration according to a compound Poisson process seems more realistic. For a detailed description of the compound Poisson process, we refer to Ross (2014) and for an example application, see Junca and Sánchez-Silva (2013). In our base case model, we assume that the inter-arrival times of shocks follow an exponential distribution with rate λ = 1.5. The sizes of the shocks are log-normally distributed with parameters µ = 2.25 and σ = 3. In order to obtain a nicely scaled deterioration process between levels 0 and 1, we divide the shock amount by a factor 10^4. For computation of discrete-time discrete-state data, we choose failure level L = 0.8 and divide the interval [0, L] into m equally sized intervals with length ∆x such that state k is associated with the interval [(k − 1)∆x, k∆x], for each k = 1, 2, . . . , m. This enables us to record the discrete system state at all discrete points t = 1, 2, . . . in time and we thus obtain the desired discrete-time discrete-state information that is required to compute P using Equation (1). Under the setting described above, based on N = 10 000 simulations of deterioration processes, we find

\[
P = \begin{pmatrix}
0.9044 & 0.0557 & 0.0131 & 0.0065 & 0.0204 \\
0 & 0.8816 & 0.0757 & 0.0140 & 0.0287 \\
0 & 0 & 0.8799 & 0.0774 & 0.0427 \\
0 & 0 & 0 & 0.8784 & 0.1216 \\
0 & 0 & 0 & 0 & 1
\end{pmatrix}.
\]
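The simulation and discretisation steps can be sketched as follows. Several points are assumptions of this illustration rather than details taken from the study: decision epochs are taken to be one time unit apart (so the number of shocks per period is Poisson distributed with mean λ), every path is simulated for a fixed number of periods, and Equation (1) is taken to be the usual count-based estimator of transition probabilities. The function names are hypothetical, and the resulting estimate need not coincide with the matrix reported above.

```python
import numpy as np

def simulate_discrete_paths(rng, n_paths, n_steps, lam=1.5, mu=2.25,
                            sigma=3.0, scale=1e4, fail_level=0.8, m=4):
    """Simulate compound Poisson deterioration; return states in {1, ..., m+1}."""
    # Number of shocks in each unit-length period (assumed period length 1)
    n_shocks = rng.poisson(lam, size=(n_paths, n_steps))
    # Draw all shock sizes at once and sum them per (path, period) cell
    sizes = rng.lognormal(mean=mu, sigma=sigma, size=n_shocks.sum()) / scale
    cell = np.repeat(np.arange(n_shocks.size), n_shocks.ravel())
    increments = np.bincount(cell, weights=sizes, minlength=n_shocks.size)
    levels = np.cumsum(increments.reshape(n_paths, n_steps), axis=1)
    # m equal intervals on [0, L); anything at or above L is state m+1 (failed)
    dx = fail_level / m
    states = np.minimum(np.floor(levels / dx).astype(int), m) + 1
    # Every path starts as-good-as-new (state 1)
    return np.hstack([np.ones((n_paths, 1), dtype=int), states])

def estimate_transition_matrix(states, m=4):
    """Count-based estimate of P from simulated state paths."""
    counts = np.zeros((m + 1, m + 1))
    src = states[:, :-1].ravel() - 1
    dst = states[:, 1:].ravel() - 1
    keep = src < m  # transitions out of the failed state are not counted
    np.add.at(counts, (src[keep], dst[keep]), 1.0)
    counts[m, m] = 1.0  # failed state treated as absorbing
    idx = np.flatnonzero(counts.sum(axis=1) == 0)
    counts[idx, idx] = 1.0  # unvisited states: keep the row stochastic
    return counts / counts.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
paths = simulate_discrete_paths(rng, n_paths=2000, n_steps=50)
P_hat = estimate_transition_matrix(paths)
print(np.round(P_hat, 4))
```

Since deterioration only accumulates, the estimated matrix is upper triangular by construction, matching the structure of P.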

Finally, we select a maintenance effectiveness matrix Q that defines to what level the network quality will be restored after performing a preventive repair action. We choose

\[
Q = \begin{pmatrix}
1 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 \\
0.90 & 0.10 & 0 & 0 \\
0.85 & 0.10 & 0.05 & 0
\end{pmatrix}.
\]


computation times. As an illustration, setting M = 50 already results in |Ω′′| = 23 426 knowledge states θ to be considered in total. Furthermore, we select ε = 10^{−3} and v^0 = 0 ∈ V.
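The count of 23 426 is consistent with a grid that contains every vector θ whose m = 4 non-failed components are multiples of 1/M and sum to one. That Equation (3) produces exactly this grid is an assumption of the sketch below. The number of such vectors follows from a standard stars-and-bars argument, and the function name is hypothetical.

```python
from math import comb

def n_grid_states(M, m):
    """Number of vectors with m non-negative components that are
    multiples of 1/M and sum to one (stars and bars)."""
    return comb(M + m - 1, m - 1)

print(n_grid_states(50, 4))  # 23426
```

The count grows cubically in M for m = 4, so doubling the grid resolution multiplies the number of knowledge states by roughly eight.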

An illustration of the optimal maintenance policy is given in Figure 1. For simplicity of representation and interpretation of the results, we provide four plots, each showing the optimal policy given that one of the elements θi, i ∈ S′, is equal to zero. Note that θm+1 is always zero.

[Figure 1: four panels showing the optimal action (DN, PM, or IN), with both axes ranging from 0.0 to 1.0. (a) θ1 = 0, axes θ2 and θ3. (b) θ2 = 0, axes θ1 and θ3. (c) θ3 = 0, axes θ1 and θ2. (d) θ4 = 0, axes θ1 and θ2.]

Figure 1: Optimal policies under the restriction that θi = 0 for some specified i ∈ S′.

In the optimal policy presented in Figure 1, we observe several patterns. First, we find that if the system is not in the as-good-as-new state, i.e., θ1 = 0, in the majority of the cases, preventive


performing preventive maintenance. Second, we see that if θ2 = 0 or if θ3 = 0, the information about the system’s condition is scattered too much, which makes performing an inspection to reveal the true state attractive. Obviously, preventive maintenance will be performed in fewer cases if θ3 = 0 than if θ2 = 0, because in the latter case, the system is more likely to have a higher level of deterioration. Third and finally, we observe that if θ4 = 0, the optimal decision is to do nothing in all possible cases. That implies that the risk of failure when the system is not in state 4 is low enough to continue operations without further intervention.

We conclude this part of our analysis by mentioning that the approximated optimal long-run cost rate g∗ equals 0.0440. Implications of changes in maintenance effectiveness matrix Q, as well as in operational cost function c on the optimal policy and the corresponding cost rate are discussed in Section 5.3.

5.2 Inclusion of Continuous Monitoring Data

In this section, we implement the framework presented in Section 4.4 and we numerically assess the long-term cost benefits of including monitoring sensor information in determining optimal actions. First, based on the N simulations we ran in order to compute transition matrix P in Section 5.1, we generate for each observation xt (t = 1, 2, . . . , T) an associated monitoring information state ψt at time t. We propose a simple way to generate the elements ψt,i (i ∈ S) of ψt, namely by

\[
\psi_{t,i} =
\begin{cases}
0, & \text{if } i = m + 1, \\
\xi, & \text{if } i = x_t, \\
\dfrac{1 - \xi}{m - 1}, & \text{otherwise},
\end{cases}
\tag{7}
\]

for all t = 1, 2, . . . , T, where ξ is a random variable following a continuous uniform distribution on the closed interval [a, b], with a, b ∈ [0, 1] and a ≤ b. Stated differently, ξ ∼ U[a, b]. When we recall the definition of deterioration state classification accuracy level η (see Section 3), we know that η = E[ξ] = ½(a + b). In our numerical study, we select a = 0.65 and b = 0.85. Hence, we have η = 0.75, which is in general a rather conservative choice for state classification accuracy based on monitoring data (Tamilselvan and Wang, 2013).
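Equation (7) translates directly into code. The following sketch generates a monitoring vector for a non-failed true state xt ∈ {1, . . . , m}. The function name and the use of NumPy's random generator are choices made for this illustration.

```python
import numpy as np

def monitoring_signal(x_t, m, a, b, rng):
    """Generate psi_t as in Equation (7) for a true state x_t in {1, ..., m}."""
    xi = rng.uniform(a, b)  # probability assigned to the true state
    # Remaining mass is spread evenly over the other m - 1 non-failed states
    psi = np.full(m + 1, (1.0 - xi) / (m - 1))
    psi[x_t - 1] = xi
    psi[m] = 0.0  # the failed state m + 1 always gets probability zero
    return psi

rng = np.random.default_rng(42)
psi = monitoring_signal(x_t=2, m=4, a=0.65, b=0.85, rng=rng)
print(psi, psi.sum())
```

By construction the entries sum to one, and averaging the entry of the true state over many draws approaches η = (a + b)/2.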

Then, by using Equation (7), we can construct ψt and we can compute θ_t^j by using Equation (2) based on the T available historical deterioration state observations xt for all j and transition matrix P. Subsequently, we find estimates for parameters αj and βj in Equation (5) for all j, which enables us to compute aggregated information state π as in Equation (4). A graphical representation of the estimation results for αj can be found in Figure 2.


[Figure 2: plot of αj (vertical axis, from 0.0 to 1.0) against j (horizontal axis, from 0 to 5).]

Figure 2: Estimation results for parameter αj.

certain observation, the predictive power of this observation is too small in comparison with the imperfect monitoring information. Therefore, for all j ≥ 3, we have αj = 0 and βj = 1. In line with our hypothesis, α0 is (nearly) equal to 1, because for j = 0, θ_t^j provides a perfect image of the system’s degradation state. Between j = 0 and j = 3, a linear combination of knowledge state θ_t^j and monitoring information ψt yields the most accurate estimates of the system’s unobserved deterioration state xt.

Now that we have found parameter estimates, we can evaluate the cost benefits resulting from the inclusion of imperfect condition monitoring data. In order to do so, we simulate T = 75 000 time steps of a deteriorating unit. For each time period t we compute θ_t^j, generate ψt, and linearly combine these two vectors using the estimates for αj and βj to obtain πt. Then, based on πt, an action is chosen according to the optimal policy found in Section 5.1. Thereafter, the corresponding immediately incurred cost is calculated and θ_t^j and πt are updated depending on the optimal decision.

We find that under our assumptions and parameter settings, the resulting simulated optimal cost per time period equals 0.0420. If we compare this result with our previous value iteration finding that g∗ equals 0.0440 (which was based on maintenance decisions where condition monitoring information was not involved), we conclude that the hybrid approach proposed in this study results in considerable cost reduction. In fact, in this particular case, a cost benefit of approximately 4.6% is achieved. We further note that if solely monitoring data were used for decision-making (i.e., αj = 0 and βj = 1 for all j), we would obtain a cost rate equal to 0.0429. Hence, our findings suggest that if both


high computation times, it is too expensive to provide confidence intervals for the presented simulation results. Therefore, we only report point estimates for simulated long-run cost rates in this study.

In Section 5.3.3, we discuss the implications of lower and higher monitoring information quality by varying the value of classification accuracy level η.

5.3 Experimental Insights

Given the computational results on the performance of our solution approach, we can now investigate the implications of varying assumptions regarding model parameters. By performing such numerical experiments, we learn about the solution’s behaviour under different practical settings. First, we investigate improvements in failure diagnosis and maintenance effectiveness and how these improvements affect the optimal maintenance and inspection policy. Second, we discuss the influence of state-dependent operational costs. Third and finally, we evaluate the effects of being able to classify the system’s deterioration state at lower and higher accuracy.

5.3.1 Failure Diagnosis Accuracy

Similar to state classification accuracy, quality of failure root cause diagnosis may also vary among specific algorithms and practical circumstances. Improved root cause diagnosis generally leads to higher maintenance effectiveness, which is in this study reflected by matrix Q. We compare the optimal maintenance policy when failure diagnosis is less accurate than in the benchmark case in Section 5.1. Additionally, we consider perfect maintenance. Note that the optimal policy is not affected by inclusion of condition monitoring information ψ. Hence, we can apply the value iteration algorithm and assess the outcomes, albeit now for a different choice of Q. In this experiment, we specify maintenance effectiveness matrices Q1 and Q2 by

\[
Q_1 = \begin{pmatrix}
1 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 \\
0.7 & 0.3 & 0 & 0 \\
0.6 & 0.3 & 0.1 & 0
\end{pmatrix},
\qquad
Q_2 = \begin{pmatrix}
1 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 \\
1 & 0 & 0 & 0
\end{pmatrix},
\]

where Q1 reflects relatively low maintenance effectiveness and Q2 corresponds to perfect maintenance. It turns out that the optimal approximated long-run cost rate under low maintenance effectiveness (i.e., Q1 is considered) is equal to 0.0448. Under perfect maintenance, the cost rate equals 0.0434.


in the base case. Namely, as it is highly uncertain whether preventive maintenance would recover the system’s state, it can be more cost-efficient to wait and accept the risk of failure instead of potentially repeatedly performing an imperfect maintenance action. Hence, the benefit of revealing the deterioration state through an inspection decreases as well and simply waiting might be wisest. Under perfect maintenance, the differences with the benchmark maintenance policy are too small to warrant further discussion. We provide a (partial) representation of the optimal maintenance policies in Figures 3 and 4 in Appendix A.

Using such a sensitivity analysis, companies can decide whether or not it is worthwhile to strive for more advanced methods for failure root cause diagnosis. Based on our results, investing in such methods seems only beneficial if the cost of development and implementation is low.

5.3.2 Operational Costs

Due to customer (dis)satisfaction, operational costs are assumed to be increasing in the system’s level of quality degradation. In contrast to the marginal implications of varying the level of maintenance effectiveness, changes in operational cost parameters heavily affect the optimal maintenance policy and corresponding long-run cost rate. We consider two cases. In the first case, state-dependent cost parameters are chosen high. Second, we consider the case in which state-dependent costs do not exist.

We recall from Section 4.2.4 that the operational cost function is given by c(θ) = Σ_{i=1}^{m} ci · θi. For the first case, we select c1 = 0, c2 = 0.1, c3 = 0.3, and c4 = 0.5. Evidently, these relatively high parameter values are reflected in the approximated long-run cost rate, which is in this case equal to 0.0499. When no operational costs are incurred, i.e., ci = 0 for all i = 1, 2, . . . , m, the cost rate equals 0.0399. Compared with benchmark cost rate g∗ = 0.0440, deviations are rather large. Considerable differences are also found in the optimal maintenance policies, for which we refer to Figures 5 and 6 in Appendix A. First, we observe that under high state-dependent operational costs, the option to do nothing has become less favourable. Now that there is an immediate ‘penalty’ on moving to a high level of deterioration, it is even more essential to keep the unit close to as-good-as-new. Hence, performance of preventive maintenance is promoted. The same holds for inspections, as those can ensure that appropriate maintenance actions can be performed as soon as a unit’s condition is too close to the deteriorated state. Obviously, the reverse applies to the case where operational costs are not incurred at all. That is, doing nothing is more often preferred, because there is less immediate harm from a cost perspective in moving to a higher level of degradation.


of state-dependent operational costs and to utilise the obtained insights to act adequately. Moreover, this result emphasises the importance of being able to quantify the direct effects of degraded perceived quality.

5.3.3 State Classification Accuracy

As we remarked in Section 4.2.2, condition state classification accuracy may depend on several factors. In relation to wireless network quality, it is found to be challenging to correctly predict the user’s QoE due to complexity of the monitoring data. Therefore, we earlier considered a relatively low value for accuracy parameter η in our analysis, so that the obtained results are valid on a general level. We recall from Equation (7) that the level of classification accuracy is defined by the probability distribution of ξ, a uniformly distributed random variable that is used to construct monitoring information ψ. Previously, we selected parameters a = 0.65 and b = 0.85, such that η = E[ξ] = 0.75. Now, by varying the values of a and b, we can investigate how the cost rate is affected by different levels of classification accuracy. Table 1 summarises computational results for the long-run cost rate based on N = 75 000 simulated time steps under a wide range of choices for a and b. We also present the value of J̃, denoting the number of time periods since the last true observation as of which weight αj equals 0. For example, as can be inferred from Figure 2, we have J̃ = 3 in our benchmark case study.

Table 1: Long-run cost rate for various values of parameters a and b.

η       a       b       Cost rate   J̃
0.55    0.5     0.6     0.0428      6
0.65    0.6     0.7     0.0424      4
0.75    0.55    0.95    0.0421      4
0.75    0.65    0.85    0.0420      3
0.75    0.7     0.8     0.0420      3
0.85    0.8     0.9     0.0416      2
0.95    0.9     1       0.0415      1

Let us begin our analysis of the results in Table 1 with the observation that, as expected, long-run cost rate and J̃ are both decreasing in the level of state classification accuracy η. Secondly, we observe that the cost rate difference between η = 0.65 and η = 0.75 is small. Hence, it is questionable whether a company should strive to close this gap in state classification accuracy. However, estimation that achieves high deterioration level prediction accuracy (i.e., η = 0.85 or η = 0.95) does yield considerable cost reduction. In that case, it is most efficient to base maintenance decisions on monitoring data solely, as knowledge state θ is of value only shortly after the true deterioration level xt has been revealed. Hence, we have J̃ = 2 and J̃ = 1, respectively.


0.75. Simulation results suggest that those effects are rather small. If the interval [a, b] is chosen too narrow, condition state classification cannot benefit from high levels of accuracy that a large value for b can offer. However, large differences between a and b seem somewhat less favourable. We argue that if a = 0.55 and b = 0.95, decisions will in some cases be made under high uncertainty, potentially leading to unnecessary maintenance actions at an early stage of the deterioration process. For example, if the realisation of uniform random variable ξ is close to lower bound a, knowledge about the level of deterioration will be highly scattered and expensive conservative actions may be taken. That also seems to be the reason why J̃ is slightly higher than for the other two settings where η = 0.75. This behaviour of the solution is arguably a direct consequence of the way we construct monitoring information ψ in this study. The reason is that, as outlined in Equation (7), when a unit’s true deterioration state is 1, it is assumed that the remaining probabilities are spread evenly among states 2, 3, . . . , m. Hence, according to ψ, the probability that the system is in state 2 is the same as the probability that the system is in state m. We observe in Figure 1 that if there is a positive probability that the system is in state m = 4, preventive maintenance is in many cases the optimal decision. So, by nature of the construction of monitoring information ψ, wide intervals [a, b] are not desirable in themselves.

6 Conclusion

We studied condition-based maintenance policies under imperfect condition monitoring. A problem was considered in which maintenance performance is imperfect and operational costs are state-dependent. Since the system’s deterioration state cannot always be completely identified, we formulated the problem as a partially observed Markov decision process. Inspections are allowed in order to reveal the unit’s true level of deterioration. The model is solved for an optimal maintenance policy using the value iteration algorithm. Whereas traditional approaches base optimal maintenance decision-making on either knowledge about the system’s deterioration process or on (imperfect) continuous monitoring data, we proposed a framework that combines these two sources of information in a hybrid manner. Our approach builds on the notion that shortly after complete observation of the true level of deterioration, the observed information is more reliable than imperfect continuous monitoring data. When some time has passed, however, monitoring information might be more informative than the observation that was made a number of time periods earlier due to uncertainties in the deterioration process. A computational study showed that the proposed framework yields promising results. For various accuracy levels of the condition monitoring information, considerable cost reductions were achieved in comparison with existing approaches.


effectiveness does not lead to very large cost reductions. In contrast, it turned out that varying operational costs considerably affects both the optimal maintenance policy and the long-run cost rate. This observation underpins the importance of being aware of the immediate effects of degraded quality and being able to quantify those effects. Finally, it was shown that in order to achieve significant cost reductions, it is necessary to aim at high levels of deterioration state classification accuracy.

We related the problem at hand to network quality degradation in a wireless telecommunications context. Network performance data can be used to predict the customer’s quality of experience level, so that internet service provider agents might perform (automated) actions to improve degraded perceived quality. Due to the complex nature of network monitoring data, errors in both condition state classification and failure root cause diagnosis should be considered in decision-making. The proposed framework suits this problem well as it could assist network operators in optimally deciding when and how to act in case of bad or uncertain network quality. Furthermore, the obtained insights are useful for companies both within and beyond the telecommunications industry to assess the value of potential investments in sophisticated condition monitoring systems. Additionally, our modelling approach can be used to evaluate the minimum level of condition state classification accuracy at which monitoring information becomes beneficial to consider in maintenance decision-making.


References

Adickes, M.D., R.E. Billo, B.A. Norman, S. Banerjee, B.O. Nnaji, and J.A. Rajgopal (2002). Optimization of indoor wireless communication network layouts. IIE Transactions 34 (9), 823–836.

Amaldi, E., S. Bosio, F. Malucelli, and D. Yuan (2011). Solving nonlinear covering problems arising in WLAN design. Operations Research 59 (1), 173–187.

Anderson, T.W. and L.A. Goodman (1957). Statistical inference about Markov chains. The Annals of Mathematical Statistics 28, 89–110.

Ben Letaifa, A. (2019). WBQoEMS: Web browsing QoE monitoring system based on prediction algorithms. International Journal of Communication Systems 32 (13), e4007.

Boutaba, R., M.A. Salahuddin, N. Limam, S. Ayoubi, N. Shahriar, F. Estrada-Solano, and O.M. Caicedo (2018). A comprehensive survey on machine learning for networking: evolution, applications and research opportunities. Journal of Internet Services and Applications 9 (16), 1–99.

Boz, E., B. Finley, A. Oulasvirta, K. Kilkki, and J. Manner (2019). Mobile QoE prediction in the field. Pervasive and Mobile Computing 59, 101039.

Chaudhry, A.U., J.W. Chinneck, and R.H.M. Hafez (2016). Fast heuristics for the frequency channel assignment problem in multi-hop wireless networks. European Journal of Operational Research 251 (3), 771–782.

Chen, M., H. Fan, C. Hu, and D. Zhou (2014). Maintaining partially observed systems with imperfect observation and resource constraint. IEEE Transactions on Reliability 63 (4), 881–890.

Christer, A.H., W. Wang, and J.M. Sharp (1997). A state space condition monitoring model for furnace erosion prediction and replacement. European Journal of Operational Research 101 (1), 1–14.

Da Hora, D., A.S. Asrese, V. Christophides, R. Teixeira, and D. Rossi (2018). Narrowing the gap between QoS metrics and web QoE using above-the-fold metrics. In International Conference on Passive and Active Network Measurement, pp. 31–43. Springer.

Das Chagas Moura, M., E. Zio, I. D. Lins, and E. Droguett (2011). Failure and reliability prediction by support vector machines regression of time series data. Reliability Engineering & System Safety 96 (11), 1527–1534.


De Jonge, B. (2019). Discretizing continuous-time continuous-state deterioration processes, with an application to condition-based maintenance optimization. Reliability Engineering & System Safety 188, 1–5.

De Jonge, B. and P.A. Scarf (2020). A review on maintenance optimization. European Journal of Operational Research 285 (3), 805–824.

De Jonge, B., R.H. Teunter, and T. Tinga (2017). The influence of practical factors on the benefits of condition-based maintenance over time-based maintenance. Reliability Engineering & System Safety 158, 21–30.

Delage, E., L.G. Gianoli, and S. Brunilde (2018). A practicable robust counterpart formulation for decomposable functions: A network congestion case study. Operations Research 66 (2), 535–567.

Dong, M. and D. He (2007). Hidden semi-Markov model-based methodology for multi-sensor equipment health diagnosis and prognosis. European Journal of Operational Research 178 (3), 858–878.

Fiedler, M., T. Hoßfeld, and P. Tran-Gia (2010). A generic quantitative relationship between quality of experience and quality of service. IEEE Network 24 (2), 36–41.

Fildes, R. and V. Kumar (2002). Telecommunications demand forecasting—A review. International Journal of Forecasting 18 (4), 489–522.

Ghasemi, A., S. Yacout, and M.-S. Ouali (2010). Parameter estimation methods for condition-based maintenance with indirect observations. IEEE Transactions on Reliability 59 (2), 426–439.

Hayashi, F. (2000). Econometrics. Princeton University Press.

Hoßfeld, T., P.E. Heegaard, M. Varela, and S. Möller (2016). QoE beyond the MOS: An in-depth look at QoE via better metrics and their relation to MOS. Quality and User Experience 1 (1), 2.

Junca, M. and M. Sánchez-Silva (2013). Optimal maintenance policy for a compound Poisson shock model. IEEE Transactions on Reliability 62 (1), 66–72.

Kakadia, D. and J.E. Ramirez-Marquez (2020). Quantitative approaches for optimization of user experience based on network resilience for wireless service provider networks. Reliability Engineering & System Safety 193, 106606.

Kanuparthy, P., C. Dovrolis, K. Papagiannaki, S. Seshan, and P. Steenkiste (2012). Can user-level probing detect and diagnose common home-WLAN pathologies? ACM SIGCOMM Computer Communication Review 42 (1), 7–15.

Kim, M.J., R. Jiang, V. Makis, and C.-G. Lee (2011). Optimal Bayesian fault prediction scheme for a partially observable system subject to random failure. European Journal of Operational Research 214 (2), 331–339.

Kurt, M. and J.P. Kharoufeh (2010). Optimally maintaining a Markovian deteriorating system with limited imperfect repairs. European Journal of Operational Research 205 (2), 368–380.

Maillart, L.M. (2006). Maintenance policies for systems with condition monitoring and obvious failures. IIE Transactions 38 (6), 463–475.

Makis, V., J. Wu, and Y. Gao (2006). An application of DPCA to oil data for CBM modeling. European Journal of Operational Research 174 (1), 112–123.

Mitra, K., A. Zaslavsky, and C. Åhlund (2013). Context-aware QoE modelling, measurement, and prediction in mobile computing systems. IEEE Transactions on Mobile Computing 14 (5), 920–936.

Olsen, T.L. and B. Tomlin (2020). Industry 4.0: Opportunities and challenges for operations management. Manufacturing & Service Operations Management 22 (1), 113–122.

Pei, C., Y. Zhao, G. Chen, R. Tang, Y. Meng, M. Ma, K. Ling, and D. Pei (2016). WiFi can be the weakest link of round trip network latency in the wild. In The 35th Annual IEEE International Conference on Computer Communications, pp. 1–9. IEEE.

Pham, H. and H. Wang (1996). Imperfect maintenance. European Journal of Operational Research 94 (3), 425–438.

Puterman, M.L. (1994). Markov decision processes. John Wiley & Sons.

Rabatel, J., S. Bringay, and P. Poncelet (2011). Anomaly detection in monitoring sensor data for preventive maintenance. Expert Systems with Applications 38 (6), 7003–7015.

Ross, S.M. (2014). Introduction to probability models. Academic Press.

Saverimoutou, A., B. Mathieu, and S. Vaton (2019). A 6-month analysis of factors impacting web browsing quality for QoE prediction. Computer Networks 164, 106905.

Sherlaw-Johnson, C., S. Gallivan, and J. Burridge (1995). Estimating a Markov transition matrix from observational data. Journal of the Operational Research Society 46 (3), 405–410.

Stetco, A., F. Dinmohammadi, X. Zhao, V. Robu, D. Flynn, M. Barnes, J. Keane, and G. Nenadic (2019). Machine learning methods for wind turbine condition monitoring: A review. Renewable Energy 133, 620–635.

Syrigos, I., N. Sakellariou, S. Keranidis, and T. Korakis (2019). On the employment of machine learning techniques for troubleshooting WiFi networks. In The 16th Annual IEEE Consumer Communications & Networking Conference (CCNC), pp. 1–6. IEEE.

Tamilselvan, P. and P. Wang (2013). Failure diagnosis using deep belief learning based health state classification. Reliability Engineering & System Safety 115, 124–135.

Trevisan, M., I. Drago, and M. Mellia (2019). PAIN: A passive web performance indicator for ISPs. Computer Networks 149, 115–126.

Van Noortwijk, J.M. (2009). A survey of the application of gamma processes in maintenance. Reliability Engineering & System Safety 94 (1), 2–21.

Verbeke, W., K. Dejaeger, D. Martens, J. Hur, and B. Baesens (2012). New insights into churn prediction in the telecommunication sector: A profit driven data mining approach. European Journal of Operational Research 218 (1), 211–229.

Wang, W. and H. Wang (2015). Preventive replacement for systems with condition monitoring and additional manual inspections. European Journal of Operational Research 247 (2), 459–471.

Zhang, F., J. Shen, and Y. Ma (2020). Optimal maintenance policy considering imperfect repairs and non-constant probabilities of inspection errors. Reliability Engineering & System Safety 193, 106615.
