
Optimal dynamic aperiodic inspection and maintenance scheduling for single-unit systems with continuous deterioration processes

Niek Kremers

August 31, 2017

Abstract

Inspection and maintenance scheduling is increasingly seen as a cost saver for companies, rather than a 'necessary evil' that is only needed when a unit fails. By optimally planning inspections, either periodic or aperiodic, based on the observed deterioration level, and by optimally performing maintenance, costs can be decreased significantly. This thesis considers a single unit that deteriorates continuously over time. A discretization approach is defined in order to transform the continuous-state, continuous-time deterioration process into a discrete-state, discrete-time deterioration process. Using this approach, a Markov decision process (MDP) is formulated, which is thereafter solved optimally using the value iteration algorithm. The performance of four different policies is examined, with (a)periodic inspections and preventive maintenance either at an inspection or possibly delayed. The policies are applied to the MDP and compared by their long-term average costs. Both policies with possibly delayed maintenance actions perform best, with the aperiodic policy outperforming all other policies. Finally, the optimal action as a function of the observed state is derived for both aperiodic policies, as these are the least restricted and therefore the most interesting.

1 Introduction

Nowadays, maintenance is of increasing importance for many companies. Whereas in the past it was seen as a 'necessary evil', companies now see the importance and possibilities of cost reduction accompanying this process. In many industries, maintenance therefore accounts for an increasingly large fraction of the total budget (De Jonge et al., 2017).

An overview and literature review of various maintenance applications is given by Dekker (1996). Although the article is somewhat older, it gives a nice representation of maintenance through the years. He notes that the first scientific approaches to maintenance management date from the 1950s and 1960s, from where it has become a growing research area. In this period, the main methodology used for maintenance optimization problems was renewal theory. One of the first contributions was Mercer (1961), who considered not only time, but also the usage of a mechanism in constructing a maintenance policy.

There are two different types of maintenance to be distinguished: preventive and corrective maintenance (PM and CM). Corrective maintenance has to be applied when the unit, or machine, fails and has to be repaired, whereas PM is applied in order to postpone, or ideally prevent, failure. In most cases, preventive maintenance is less expensive and has less severe consequences than corrective maintenance. PM is therefore generally preferred over CM. Furthermore, PM is favoured because it can be planned in advance. Altogether, the goal of a maintenance study is to make a comparative assessment between the risk of failure and the cost of PM, as performing PM too often can become costly as well.

Preventive maintenance that is scheduled purely based on time or age, irrespective of the actual condition of the unit, can waste a substantial remaining lifetime of the unit. The main reason that we nowadays see a trend towards condition-based maintenance (CBM) is that monitoring the condition of a unit has become less expensive due to technological improvements over the years.

In CBM models, there is a difference between continuous monitoring and inspections. When the first method is applied, as the name suggests, one knows the true status of the equipment at all times. This method is, however, often too expensive to implement. Hence, for many units, inspections have to be performed to learn the current condition of the unit. Within inspection scheduling, one can distinguish between periodic and aperiodic inspections. When aperiodic inspection scheduling is adopted, the next inspection time is based on the correctly observed deterioration state, whereas with periodic inspection scheduling, inspections have to be performed at fixed time intervals.

In this thesis we consider a single unit that deteriorates continuously over time and for which inspections are required to reveal the current deterioration level. We aim to determine the optimal maintenance policy with (a)periodic inspections based on a Markov decision process (MDP) formulation of the model. In order to do so, we develop a discretization approach that discretizes both time and the condition levels of the unit, transforming the continuous-state, continuous-time deterioration process into a discrete-state, discrete-time deterioration process. This allows us to solve the model optimally with standard algorithms, rather than applying heuristics to the continuous deterioration process. We consider four different policies, with (a)periodic inspections and PM either only allowed at an inspection or possibly delayed. Ultimately, we formulate the optimal action as a function of the correctly observed deterioration level.

The remainder of this thesis is organized as follows. In Section 2 we carry out a literature review of relevant articles considering heuristics or algorithms for (a)periodic inspection scheduling. Next, in Section 3, the discretization approach is introduced and explained, accompanied by an example for the special case of a unit deteriorating according to a gamma process. Then, in Section 4, the four policies are defined, as well as the MDP to which the discretization approach is applied. This section furthermore presents the value iteration algorithm, which is used to solve the MDP. Thereafter, Section 5 presents numerical examples of the application of the policies. Finally, this thesis is concluded in Section 6, where we also give suggestions for future research.

2 Literature review

In this section, we review recent articles on continuous-state, continuous-time deterioration processes, where the state is only observed through (imperfect) inspections. These articles apply heuristics in order to solve the problem. They consider the time until the next inspection and sometimes also the threshold for PM to be decision variables.

An example is Grall et al. (2002), who consider perfect inspections on a continuously deteriorating system. Failures are immediately detected, after which CM has to be applied directly. They choose the aperiodic inspection times and the critical threshold for PM. The trade-off they face is between a low threshold, which prevents full exploitation of the remaining lifetime, and a high threshold, which keeps the system working with a large risk of failure. A multi-level control-limit rule for stochastically deteriorating systems is used to test the performance and minimize the expected long-term costs.

In the same year, Grall, Dieulle, Bérenguer, and Roussignol (2002) developed another mathematical maintenance cost model, in order to jointly optimize the inspection schedule and the replacement threshold for a single unit. In their model, failure is only detected at an inspection, and downtime costs are incurred for the time the system is failed. As CM is typically costly, a PM policy is said to be profitable, as failure is avoided and the system availability and safety are improved. The maintenance cost model aims to find the optimal balance between the costs and benefits of the maintenance strategy. They jointly optimize the threshold and the inspection schedule, where they allow aperiodic inspections.

Dieulle et al. (2003) consider a related model, jointly optimizing an aperiodic inspection schedule and a preventive replacement threshold. CM has to be performed if the system is observed to be in the failed state, implying that failure is not self-announcing. While the system is failed, downtime costs or production-delay costs are incurred, both disadvantageous for cost minimization. Aperiodic inspections are allowed, and the inspection scheduling function has to be optimized jointly with the critical threshold, minimizing the global system costs. They consider a linear inspection function, and show numerical optimization of the cost criterion.

Another example is Castanier et al. (2003), who consider aperiodic inspections and downtime costs, as failure is only detected at an inspection. In addition to as-good-as-new repair, they consider partial repair, which only reduces the deterioration level. Partial repair can be profitable as it takes less time and costs less than perfect repair or replacement. A CBM decision framework based on the knowledge of the system state is used, and an associated cost model is developed in order to optimize the performance of the maintenance policy. They show that Markov renewal stochastic techniques are efficient tools to solve the maintenance model analytically, whereas before, one had to resort to Monte Carlo simulations.

Maillart (2006) considers a Markov decision process for which the true state is only observed through inspections, modelled as a partially observable Markov decision process (POMDP). The inspections can be either perfect or imperfect, and both cases are described. PM has to be performed on a multi-state Markovian deterioration system when the observed deterioration level exceeds a threshold. Failures are self-announcing, and the failed system has to be maintained (CM) immediately. Several experiments are analyzed numerically, minimizing the long-term cost rate.

Golmakani and Fattahipour (2011) also consider a CBM model, and apply an age-based inspection schedule, where the system's age is a leading factor in determining the next inspection. Besides the costs for PM and CM, inspection costs are taken into account, and the goal is to minimize the total average costs of replacement (maintenance) and inspection, using a control-limit policy. Under the age-based schedule, inspection intervals are long early in the life of the machine or unit, and shorten as time passes, the unit ages, and failure becomes more likely. The costs due to the higher inspection frequency increase, but as it gives more certainty in preventing failure, it avoids the costs of CM.

Flage et al. (2012) consider an inspection model in order to observe the condition, given the deterioration process. However, they also consider the case where the parameters of the deterioration process itself are uncertain. Inspections then do not monitor the total condition, but rather the change in condition within a certain time period, that is, the deterioration within a period rather than the actual total deterioration level. A Bayesian model is applied in order to obtain more information on the deterioration process, which in turn can be used to better define the model, and ultimately to minimize the expected long-run costs per unit of time.

A more recent article is that of Berrade et al. (2013), who consider imperfect inspections. An alarm is followed by a check of its validity (at additional cost), after which the decision is made to retire the unit or to continue service. Another policy they consider is one where a false positive inspection leads to the conclusion that the unit has failed, after which maintenance is performed. Inspections may be performed aperiodically, rather than at pre-set times.

Furthermore, Do et al. (2015) consider CBM with both perfect and imperfect maintenance actions. The former restores the unit perfectly, whereas the latter restores it to any state between the current and the as-good-as-new state. They consider the gamma process to describe the deterioration of the unit. An adaptive maintenance policy is proposed, where the aperiodic inspection schedule is based on the expected useful lifetime of the unit, which is used to provide maintenance planning with a certain reliability level.


3 Discretizing continuous-state, continuous-time deterioration processes

In this section, we introduce an approach to approximate stationary continuous-state, continuous-time, non-decreasing deterioration processes X(t) by discrete-time Markov chains (DTMC). An example of such a continuous-state, continuous-time process is the stationary gamma process, which is widely used to model continuous deterioration of equipment. We elaborate on the gamma process in Section 3.1, and apply the discretization approach to this process in Section 3.2. We assume that the deterioration level is 0 when the unit is as-good-as-new, and we partition the state space into intervals of length ∆x. The k-th interval, denoted x_k, is thus given by x_k = [(k − 1)∆x, k∆x], k = 1, 2, . . . .

We will also discretize time and denote the length of the time steps by ∆t. The length of the time steps obviously influences the probabilities of jumping between deterioration intervals, since for a small time step, the unit can be expected to deteriorate less compared to a larger time step. All combined, the deterioration process X(t), the discrete states with length ∆x and the time periods with length ∆t, are the inputs of this discretization approach. For the moment, we assume an infinite number of states and time periods.

Using the inputs mentioned above, the goal is to find the probabilities of moving from state k (interval x_k) to state k + m (interval x_{k+m}). Because the considered deterioration processes are non-decreasing, we can only go from state k to state k + m if m ≥ 0. Furthermore, because we consider stationary deterioration processes, these probabilities only depend on the value of m and not on the current state k.

Considering a given interval, in order to evaluate the probabilities corresponding to the DTMC, observe that the position within this interval is important, as the probability of leaving the interval depends on the current position: this probability is higher at a relatively high level within the interval than at a lower level. However, a property of a Markov chain is that the transition probabilities only depend on the current state. As an approximation, when the current state of the DTMC is k, we assume that the exact deterioration level of the underlying continuous deterioration process is uniformly distributed on the interval x_k = [(k − 1)∆x, k∆x]. Thus, when the deterioration level is in a certain interval, we assume that it is equally likely to be anywhere within that interval. Letting X(t) denote the deterioration level at time t, we have

\[
\{X(t) \mid X(t) \in x_k\} \sim \mathrm{Unif}(x_k).
\]

Furthermore, we let f(x; t) denote the density function of the additional amount of deterioration during a time interval of length t, and let F(x; t) denote the corresponding distribution function. Because we consider stationary deterioration processes, we have that X(τ) − X(t) ∼ f(x; τ − t) for all τ > t ≥ 0.

When the exact current deterioration level is x ∈ x_k, we are still in the same interval after the next period if the additional amount of deterioration during that period does not exceed k∆x − x. Because x ∼ Unif(x_k), the probability that we are in the same deterioration interval in the next time period is given by

\[
P\{X(t+\Delta t) \in x_k \mid X(t) \sim \mathrm{Unif}(x_k)\}
= \int_{(k-1)\Delta x}^{k\Delta x} \frac{1}{\Delta x} \int_{0}^{k\Delta x - x} f(y; \Delta t)\,dy\,dx
= \frac{1}{\Delta x} \int_{0}^{\Delta x} \int_{0}^{\Delta x - x} f(y; \Delta t)\,dy\,dx
\]
\[
= \frac{1}{\Delta x} \int_{0}^{\Delta x} \Bigl[ F(y; \Delta t) \Bigr]_{y=0}^{\Delta x - x} dx
= \frac{1}{\Delta x} \int_{0}^{\Delta x} F(\Delta x - x; \Delta t)\,dx, \tag{1}
\]

where the last equality holds because F(0; ∆t) = 0.

The other possibility is that the current deterioration interval is left, and that we thus move to one of the subsequent intervals. When we move from a deterioration level x within interval x_k to a deterioration level in interval x_{k+m}, m ≥ 1, the additional amount of deterioration must lie between (k + m − 1)∆x − x and (k + m)∆x − x. The probability of a jump from interval x_k to interval x_{k+m}, m ≥ 1, is therefore given by

\[
P\{X(t+\Delta t) \in x_{k+m} \mid X(t) \sim \mathrm{Unif}(x_k)\}
= \int_{(k-1)\Delta x}^{k\Delta x} \frac{1}{\Delta x} \int_{(k+m-1)\Delta x - x}^{(k+m)\Delta x - x} f(y; \Delta t)\,dy\,dx
\]
\[
= \frac{1}{\Delta x} \int_{0}^{\Delta x} \Bigl[ F(y; \Delta t) \Bigr]_{m\Delta x - x}^{(m+1)\Delta x - x} dx
= \frac{1}{\Delta x} \int_{0}^{\Delta x} \Bigl( F\bigl((m+1)\Delta x - x; \Delta t\bigr) - F\bigl(m\Delta x - x; \Delta t\bigr) \Bigr) dx. \tag{2}
\]

For most realistic deterioration processes, such as the gamma process, the integrals in equations (1) and (2) cannot be evaluated algebraically, and we therefore need to resort to numerical analysis.

3.1 Gamma process

In this section we apply the discretization approach to the stationary gamma process, one of the most commonly used stochastic processes to model continuous deterioration processes (Van Noortwijk, 2009). Van Noortwijk (2009) wrote an overview article considering different stochastic deterioration processes, with a main focus on the application of the gamma process to model deterioration. This process is suitable for modelling a stochastic, monotonic deterioration process, since it has an infinite number of independent, non-negative increments in any time period. One of the first to consider this process for modelling deterioration was Abdel-Hameed (1975), after which numerous other authors applied the gamma process. Since the gamma process is widely accepted and commonly used to model deterioration, it will be applied in this section.

The gamma process is based on the gamma distribution, whose density function is given by

\[
f_{\alpha,\beta}(x) = \frac{1}{\beta^{\alpha}\,\Gamma(\alpha)}\, x^{\alpha - 1} \exp\Bigl(-\frac{x}{\beta}\Bigr),
\]

where Γ(α) denotes the gamma function, given by

\[
\Gamma(\alpha) = \int_{0}^{\infty} z^{\alpha - 1} \exp\{-z\}\,dz.
\]

The stationary gamma process has a shape function α(t) = at (with a > 0), thus dependent on the length of the time interval, and scale parameter b > 0. The gamma process has the following properties (Van Noortwijk et al., 2007):

1. X(0) = 0 with probability 1,

2. X(τ) − X(t) ∼ f_{a(τ−t), b} for τ > t ≥ 0,

3. X(t) has independent increments,

4. it is a jump process with infinitely many jumps in any time interval.

Note that the gamma process we consider in this thesis has shape function α(∆t) = a∆t, as the time steps we deal with have length ∆t. Furthermore, denote the distribution function corresponding to this gamma process by F_{a∆t,b}(x). Substituting this into equations (1) and (2), the probability of staying in the current interval for a unit deteriorating according to this gamma process equals

\[
P\{X(t+\Delta t) \in x_k \mid X(t) \sim \mathrm{Unif}(x_k)\} = \frac{1}{\Delta x} \int_{0}^{\Delta x} F_{a\Delta t, b}(\Delta x - x)\,dx, \tag{3}
\]

and the probability of jumping from interval x_k to interval x_{k+m}, m ≥ 1, equals

\[
P\{X(t+\Delta t) \in x_{k+m} \mid X(t) \sim \mathrm{Unif}(x_k)\} = \frac{1}{\Delta x} \int_{0}^{\Delta x} \Bigl( F_{a\Delta t, b}\bigl((m+1)\Delta x - x\bigr) - F_{a\Delta t, b}\bigl(m\Delta x - x\bigr) \Bigr) dx. \tag{4}
\]

As mentioned before, these expressions cannot be evaluated algebraically, and we thus have to resort to numerical analysis.

3.2 Applying discretization

In the remainder of this study, we will use discrete-time Markov chains to approximate continuous-state, continuous-time deterioration processes. Until now, we assumed an infinite number of states. When we consider a deteriorating system, however, we typically consider a deterioration level L that represents the failure level of the unit. All deterioration levels larger than L then indicate that the unit is in the failed state. We subdivide the deterioration levels between 0 and L into n equally sized intervals of length ∆x = L/n. All remaining intervals are combined into a single (n + 1)-th interval, representing the failed state.

The probabilities of jumping from one state to another can then be represented by a transition probability matrix P, where entry P[i, j] denotes the probability of jumping from state i to state j in a time period of length ∆t.

We calculate the elements of the transition probability matrix iteratively, as only the number of states we jump matters in the calculations; see equations (1) and (2). The first row of the matrix, excluding the probability of failure, can be calculated as

\[
P[1, j] = \begin{cases}
\dfrac{1}{\Delta x} \displaystyle\int_{0}^{\Delta x} F(\Delta x - x; \Delta t)\,dx, & \text{if } j = 1, \\[3mm]
\dfrac{1}{\Delta x} \displaystyle\int_{0}^{\Delta x} \Bigl( F\bigl(j\Delta x - x; \Delta t\bigr) - F\bigl((j-1)\Delta x - x; \Delta t\bigr) \Bigr) dx, & \text{if } 2 \le j \le n.
\end{cases}
\]

The probabilities in the other rows, again without the probability of failure, follow easily from the first row. We start at the second row, i = 2, and continue until the final running state, i = n, which gives

\[
P[i, j] = \begin{cases}
0, & \text{if } 1 \le j < i \le n, \\
P[i-1,\, j-1], & \text{if } 2 \le i \le j \le n.
\end{cases}
\]

Probabilities of jumping backwards are obviously set to zero, as the deterioration processes considered are non-decreasing. The resulting transition probability matrix will hence be upper triangular.

The probabilities of going to the failed state are calculated by using that the probabilities in each row should add up to one. The probabilities in the final column of the matrix can thus be calculated as

\[
P[i,\, n+1] = \begin{cases}
1 - \sum_{j=1}^{n} P[i, j], & \text{if } 1 \le i \le n, \\
1, & \text{if } i = n + 1.
\end{cases}
\]

We conclude this section with an example of a transition probability matrix based on a continuous deterioration process. We assume a gamma deterioration process with parameters a = 5 and b = 0.2, and a failure level L = 1. We approximate this continuous process by a discrete process with n = 4 deterioration states and a fifth failed state, which implies that ∆x = 1/4. Furthermore, we consider time steps of length ∆t = 0.1. The transition probability matrix P then follows from equations (3) and (4) by numerical integration.
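To illustrate how such a matrix can be obtained, the sketch below computes P numerically. This is our own illustration, not the author's code; the function name transition_matrix and its interface are ours. It evaluates the integrals in equations (3) and (4) by numerical quadrature, using the gamma distribution with shape a∆t and scale b.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import gamma

def transition_matrix(a, b, L, n, dt):
    """Discretize a stationary gamma process (shape a*t, scale b) into an
    (n+1) x (n+1) transition matrix; state n+1 is the failed state."""
    dx = L / n
    # CDF of the deterioration increment over one period of length dt.
    F = lambda x: gamma.cdf(x, a * dt, scale=b)

    # First row via equations (3) and (4); the other rows are shifted copies.
    first = np.zeros(n)
    first[0] = quad(lambda x: F(dx - x), 0, dx)[0] / dx
    for j in range(2, n + 1):
        first[j - 1] = quad(lambda x: F(j * dx - x) - F((j - 1) * dx - x),
                            0, dx)[0] / dx

    P = np.zeros((n + 1, n + 1))
    for i in range(n):
        P[i, i:n] = first[:n - i]       # P[i, j] = P[i-1, j-1], zero below diagonal
        P[i, n] = 1.0 - P[i, :n].sum()  # failure probability closes the row
    P[n, n] = 1.0                       # the failed state is absorbing
    return P

# Example of this section: a = 5, b = 0.2, L = 1, n = 4, dt = 0.1.
P = transition_matrix(a=5, b=0.2, L=1, n=4, dt=0.1)
print(np.round(P, 4))
```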


4 Inspection and maintenance optimization

In this section, we define the four different inspection and maintenance policies that will be applied. Furthermore, we formally define the inspection and maintenance problem that we consider for the single unit deteriorating continuously over time, and we formulate the Markov decision process (MDP) that we will use to solve this problem.

This section is split into multiple parts, and starts by defining the four different inspection and maintenance policies that we consider. After this, the mathematical problem and the components of the MDP are defined. Thereafter, the algorithm used to solve the MDP is introduced. This algorithm will be used to test the performance of the discretization approach and the four different policies.

4.1 The policies

We will consider four different policies:

− PI/PMI: Periodic Inspections with PM only possible at an Inspection.
− PI/PMD: Periodic Inspections with possibly Delayed PM.
− AI/PMI: Aperiodic Inspections with PM only possible at an Inspection.
− AI/PMD: Aperiodic Inspections with possibly Delayed PM.

Altogether, it is thus of interest what can be gained by (i) aperiodic inspections, (ii) delayed PM actions, and (iii) their combination. Note that some policies are more restricted than others. For instance, the PI policies restrict inspections to certain points in time, whereas the AI policies allow inspections in every time period. Furthermore, some policies only allow PM in combination with an inspection, whereas others allow PM to be delayed. The MDP to which these policies will be applied, and the possible actions for every policy, are formally defined in Section 4.3.

Under the first and second policy, we deal with periodic inspections. This means that inspections have to be performed every T time units. The difference between the two is that the PI/PMD policy allows delayed PM, which means that PM might be scheduled in between inspections, contrary to the limitation of performing PM only at an inspection in the PI/PMI policy. For both policies it holds that at the occurrence of a failure, immediate CM has to be performed and the inspection schedule is reset, meaning that the next inspection will be performed after T time units. The parameter T, the inspection interval, has to be optimized for both policies.

For the AI/PMI and AI/PMD policies, we consider aperiodic inspections. This means that rather than using a pre-set number of time periods between inspections, the time of the next inspection is chosen depending on the observed deterioration level of the unit. The next inspection time may thus be different for every observed state. This gives a larger feasible region for the algorithm, which is the reason that we expect the AI/PMI and AI/PMD policies to outperform the PI/PMI and PI/PMD policies, respectively. Again, the policies differ in whether PM may be performed between inspections or only at an inspection, where the policy that allows delayed PM is expected to perform better.

Comparison between the PI/PMI and AI/PMI policies, or the PI/PMD and AI/PMD policies, gives insights into the benefits of an aperiodic inspection schedule compared to a periodic inspection schedule. Comparison between the PI/PMI and PI/PMD policies, or the AI/PMI and AI/PMD policies, gives insights into the benefits of the possibility of delayed PM, contrary to the obligation to perform PM only at an inspection. The AI/PMD policy, with aperiodic inspections and possibly delayed PM, is the least restricted policy, and is therefore expected beforehand to perform best.

4.2 Problem formulation

We consider a single unit that deteriorates over time according to a discrete-time Markov chain with m + 1 deterioration states, obtained by applying the discretization approach of Section 3. We let deterioration state 1 represent the 'new' state, state m the most worn state, and state m + 1 the failed state.

We assume that a failure is self-announcing, and when it occurs, corrective maintenance (CM) has to be carried out immediately. CM is assumed to take a negligible amount of time and brings the unit back to the as-good-as-new state. The associated cost is denoted by c_cm.

All other deterioration states can only be observed by performing an inspection, which can be done at the start of a time period. For the PI/PMI and PI/PMD policies, an inspection is required every T time periods, where the inspection interval T has to be chosen optimally. The cost of an inspection is denoted by c_i, and an inspection is assumed to take a negligible amount of time.

Besides an inspection, preventive maintenance can be performed at the start of a period at cost c_pm. Note that for the PI/PMI and AI/PMI policies, a PM action is only allowed following an inspection, whereas for the PI/PMD and AI/PMD policies, PM may be performed delayed. Preventive maintenance is also assumed to take a negligible amount of time and restores the unit to the as-good-as-new state. Note that because both actions take a negligible amount of time, it is possible to inspect and perform PM at the start of the same time period. Moreover, for the same reason, a unit can both be maintained and deteriorate within the same period.

The unit will deteriorate according to an (m+1)×(m+1) transition probability matrix for the discretized deterioration process based on the underlying continuous-state, continuous-time deterioration process and time steps of length ∆t. Refer to Section 3.2 for details on how this matrix is constructed. When a unit is maintained, it deteriorates according to this matrix starting at the as-good-as-new state. Given an inspection, but no maintenance action, the unit deteriorates from the observed state according to the matrix.

We will test different policies, considering (a)periodic inspections and either PM only possible after an inspection, or possibly delayed PM. The differences in performance will be of interest. They show us the benefits of aperiodic inspections as opposed to periodic inspections, and the benefits of delayed preventive maintenance options as opposed to only PM at inspections.

4.3 Markov decision process formulation

Without an inspection or a PM action, there is uncertainty about the current state, which implies that we deal with a partially observable Markov decision process (POMDP). This is a special case of a Markov decision process (MDP), for which we refer to Puterman (2014), who considers and describes the MDP components in detail.

In our POMDP setting, we let T = {1, 2, . . . , M} denote the set of decision epochs. As we deal with an infinite time horizon, M = ∞, and the goal will be to minimize the mean cost per unit of time. The POMDP in this thesis consists of the following components:

− Since the Markov process is only partially observable, we do not know its true state at all times. Hence we introduce a set of knowledge states, or belief states. Let π^{i,j} ∈ R^{m+1} denote the knowledge state if deterioration level i was observed at the last inspection, j time periods ago, i = 1, . . . , m, j = 0, . . . , M. The knowledge states are individually defined as

\[
\pi^{i,j} = \bigl[\pi^{i,j}_1, \pi^{i,j}_2, \ldots, \pi^{i,j}_{m+1}\bigr] \in \Omega, \qquad i = 1, \ldots, m, \quad j = 0, 1, \ldots, M,
\]

where π^{i,j}_k ≥ 0 denotes the probability that the system is in deterioration state k, for k = 1, . . . , m + 1. Again, i gives the last observed deterioration state of the unit, and j the number of time periods since this last inspection. The set of all knowledge states is given by

\[
\Omega = \Bigl\{ \pi \ge 0 : \sum_{k=1}^{m} \pi_k = 1 \Bigr\}.
\]


Note that for knowledge states π^{i,0}, i = 1, . . . , m, an inspection has just been performed and we know the state with certainty. The knowledge state is then defined as

\[
\pi^{i,0}_k = \begin{cases}
1, & \text{if } k = i, \\
0, & \text{if } k \ne i.
\end{cases}
\]

The number of knowledge states is infinite. However, we can reasonably provide an upper bound N on the number of time periods between consecutive inspections and maintenance actions. This allows us to formulate an MDP with a finite state space Ω′.

For i = 1, . . . , m, j = 1, . . . , N, the knowledge states can be determined recursively as

\[
\tilde{\pi}^{i,j} = \pi^{i,j-1} P, \qquad i = 1, \ldots, m, \quad j = 1, 2, \ldots, N.
\]

The probabilities of being in the deterioration states 1, . . . , m must be rescaled, however, since a knowledge state never corresponds to the failed state, i.e. π^{i,j}_{m+1} = 0 for i = 1, . . . , m, j = 1, . . . , N. The other probabilities are thus divided by 1 − π̃^{i,j}_{m+1}, and the new knowledge states are given by

\[
\pi^{i,j} = \Biggl( \frac{\tilde{\pi}^{i,j}_1}{1 - \tilde{\pi}^{i,j}_{m+1}}, \; \ldots, \; \frac{\tilde{\pi}^{i,j}_m}{1 - \tilde{\pi}^{i,j}_{m+1}}, \; 0 \Biggr).
\]

If the current knowledge state is π^{i,j} and the action 'Do Nothing' is chosen, the probability of failure at the next transition, q^{i,j}, equals

\[
q^{i,j} = \sum_{k=1}^{m} \pi^{i,j}_k P_{k,\, m+1}, \qquad i = 1, \ldots, m, \quad j = 0, \ldots, N,
\]

that is, the probability of being in state k given knowledge state π^{i,j}, multiplied by the probability of failure when being in state k, summed over k = 1, . . . , m. A computational sketch of this recursion is given after this list.

− A set of available actions given the knowledge state, A(π^{i,j}) = {DN, IN, PM}, where 'DN', 'IN' and 'PM' denote the actions 'Do Nothing', 'Inspection' and 'Preventive Maintenance', respectively. The corresponding immediate costs of these actions are 0, c_i, and c_pm, respectively.

Note that any action besides DN in the observed as-good-as-new state, π^{1,0}, is costly; as a result, DN is the cost-minimizing choice there. The action IN need not be considered in the knowledge states π^{i,0}, i = 1, . . . , m, as we then know the deterioration state with certainty. An inspection would not give any extra information, but would be costly, and is therefore not profitable.

In all other knowledge states π^{i,j}, a subset of the action space is available, depending on the policy. As the actions IN and PM both take a negligible amount of time, they may be chosen immediately after each other within the same time period. They therefore do not need to be combined into a single action.

As a failure is assumed to be self-announcing, and CM then has to be performed immediately, CM is not included in the action space. Instead, the cost c_cm is incurred immediately when the system enters the failed state and immediate CM is performed.

A necessary assumption is that preventive maintenance is less costly than corrective maintenance, c_pm < c_cm. This assumption is needed, since otherwise the best option would always be to do nothing until the unit fails, making both inspections and preventive maintenance obsolete. In a mathematical setting, the action space can be defined for each policy as

\[
A(\pi^{i,0}) = \begin{cases}
\{DN\}, & \text{if } i = 1, \\
\{DN, PM\}, & \text{if } i = 2, \ldots, m,
\end{cases}
\]

which holds for all policies, as all actions besides DN in the as-good-as-new state, π^{1,0}, are costly, and no inspection is allowed in the knowledge states π^{i,0}, i = 2, . . . , m.

Then, for all policies separately, the action space for j > 0 is defined as

\[
A^{PI/PMI}(\pi^{i,j}) = \begin{cases} \{IN\}, & \text{if } i = 1, \ldots, m,\ j = T, \\ \{DN\}, & \text{elsewhere,} \end{cases}
\qquad
A^{PI/PMD}(\pi^{i,j}) = \begin{cases} \{IN, PM\}, & \text{if } i = 1, \ldots, m,\ j = T, \\ \{DN, PM\}, & \text{elsewhere,} \end{cases}
\]
\[
A^{AI/PMI}(\pi^{i,j}) = \{DN, IN\}, \qquad i = 1, \ldots, m, \quad j = 1, \ldots, N,
\]
\[
A^{AI/PMD}(\pi^{i,j}) = \{DN, IN, PM\}, \qquad i = 1, \ldots, m, \quad j = 1, \ldots, N.
\]

Observe that for both periodic policies, PI/PMI and PI/PMD, an inspection has to be performed if the time since the last inspection equals T (PI/PMD also allows a PM action at this point). The inspection interval T has to be optimized in order to obtain the optimal periodic inspection schedule. Furthermore, observe that the PI/PMI and AI/PMI policies do not allow PM actions except at j = 0, as PM may only be performed at an inspection. The AI/PMD policy can be regarded as the most general, as all actions are possible for every knowledge state.
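As referred to above, the following sketch shows how the knowledge states π^{i,j} and the failure probabilities q^{i,j} can be computed from the transition matrix P. It is a minimal illustration under our own naming conventions, not the author's implementation; states are 0-indexed in the code.

```python
import numpy as np

def knowledge_states(P, N):
    """Knowledge states pi[i, j] and failure probabilities q[i, j] for the
    running states i = 1..m (0-indexed here) and j = 0..N periods since inspection."""
    m = P.shape[0] - 1                  # running states; index m is the failed state
    pi = np.zeros((m, N + 1, m + 1))
    q = np.zeros((m, N + 1))
    for i in range(m):
        pi[i, 0, i] = 1.0               # just inspected: state i known with certainty
        for j in range(1, N + 1):
            tilde = pi[i, j - 1] @ P    # propagate one period through the DTMC
            tilde[:m] /= 1.0 - tilde[m] # rescale, conditioning on no failure
            tilde[m] = 0.0              # a knowledge state never sits in the failed state
            pi[i, j] = tilde
        q[i] = pi[i] @ P[:, m]          # failure probability under 'Do Nothing'
    return pi, q
```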

4.4 Value Iteration Algorithm

In this section, we describe the value iteration algorithm, a commonly used and efficient method to solve POMDPs. We give a formal definition of the algorithm as it is applied in this thesis.

First of all, let the value v_n(π^{i,j}) denote the minimum total expected cost as a function of the knowledge state π^{i,j} with n periods left. The final costs, as a function of the final knowledge state, are v_0(π^{i,j}). The value iteration algorithm generates a sequence of values v_0, v_1, . . . ∈ V, where V represents the set of functions from the finite knowledge state space Ω′ to the real numbers R.

Recall that the probability of failure for knowledge state π^{i,j}, given that the action chosen is DN, is denoted by q^{i,j} and is calculated as

\[
q^{i,j} = \sum_{k=1}^{m} \pi^{i,j}_k P_{k,\, m+1},
\]

i.e. the probability of being in state k given knowledge state π^{i,j}, multiplied by the probability of failure given that we are in state k, summed over k = 1, . . . , m.

Before the algorithm itself is introduced, we explain how the value v_n is calculated based on the value v_{n−1}. The new values v_n are calculated iteratively, using the previously calculated values v_{n−1} and the costs-to-go, which are given by

\[
DN(\pi^{i,j}, n) = q^{i,j}\bigl(c_{cm} + v_{n-1}(\pi^{1,0})\bigr) + \bigl(1 - q^{i,j}\bigr)\, v_{n-1}(\pi^{i,j+1}),
\]
\[
PM(\pi^{i,j}, n) = c_{pm} + v_n(\pi^{1,0}), \qquad \text{for } i = 2, \ldots, m \text{ and } j = 0, \text{ and for } i = 1, \ldots, m \text{ and } j = 1, 2, \ldots, N,
\]
\[
IN(\pi^{i,j}, n) = c_i + \sum_{l=1}^{m} \pi^{i,j}_l\, v_n(\pi^{l,0}), \qquad \text{for } i = 1, \ldots, m, \quad j = 1, 2, \ldots, N.
\]

As was already mentioned in Section 4.3, the only action possible in the as-good-as-new state, π^{1,0}, is DN, and the only actions allowed just after an inspection, in knowledge states π^{i,0}, i = 2, . . . , m, are DN and PM. As the costs-to-go corresponding to DN are based on previously calculated values v_{n−1}(π^{i,j}), these should be calculated first. Thereafter, PM(π^{i,0}, n) should be calculated, as it relies on v_n(π^{1,0}), which equals DN(π^{1,0}, n) and is thus already available. Finally, IN(π^{i,j}, n) can be calculated, as it relies on the values v_n(π^{l,0}), and in each such state either DN or PM is chosen as the optimal action. The costs-to-go should hence be calculated iteratively, in the order DN, PM, IN.

The costs-to-go represent the costs given the actions DN, IN and PM, in an obvious notation. Note that, implicitly, the costs are specified such that infeasible actions are excluded. These are PM when the unit has just been inspected and observed to be in the as-good-as-new state, and an inspection when the system has just been inspected.

Using the costs-to-go, the value function is defined as the minimum cost over the available actions:

\[
v_n(\pi^{1,0}) = DN(\pi^{1,0}, n),
\]
\[
v_n(\pi^{i,0}) = \min_{a \in A^{POL}(\pi^{i,0})} \bigl\{ DN(\pi^{i,0}, n),\, PM(\pi^{i,0}, n) \bigr\}, \qquad \text{for } i = 2, \ldots, m,
\]
\[
v_n(\pi^{i,j}) = \min_{a \in A^{POL}(\pi^{i,j})} \bigl\{ DN(\pi^{i,j}, n),\, IN(\pi^{i,j}, n),\, PM(\pi^{i,j}, n) \bigr\}, \qquad \text{for } i = 1, \ldots, m, \quad j = 1, \ldots, N.
\]

Note that the upper bound N has to be chosen such that an action different from DN is optimal before j = N, as we want to know the optimal action for every knowledge state π^{i,j}, i = 1, . . . , m, j = 1, . . . , N.

One can observe that only DN is allowed in the as-good-as-new state, π^{1,0}, and that IN is not allowed when an inspection has just been performed and the current deterioration state is thus known with certainty. The cost-minimizing action in every value iteration is the minimum over the allowed actions a ∈ A^{POL}(π^{i,j}), i = 1, . . . , m, j = 1, . . . , N, depending on which policy is applied. Refer to Section 4.1 for the specification of the policies and to Section 4.3 for details on the available actions for each policy.

The algorithm creates an ε-optimal policy, minimizing average costs. The algorithm continues until the span of the difference between two consecutive iterations n − 1 and n is less than ε. The span of a value function v is defined as the difference between its maximum and minimum over all possible knowledge states:

\[
\mathrm{sp}(v) = \max_{\pi^{i,j} \in \Omega'} v(\pi^{i,j}) - \min_{\pi^{i,j} \in \Omega'} v(\pi^{i,j}).
\]

Using the span, we ultimately make the costs independent of the initial values v_0(π^{i,j}), since the difference in value changes across knowledge states becomes negligibly small. When this difference is small enough, i.e. ε-optimal, the algorithm terminates. The minimum gives a lower bound on the actual costs, whereas the maximum gives an upper bound. As they approach each other, the costs become independent of the initial state.

The value iteration algorithm can then be specified as follows:

1. Select a starting value v_0 ∈ V, specify ε > 0 and set n = 0.

2. For i = 1, 2, . . . , m and j = 0, 1, 2, . . . , N, compute v_{n+1}(π^{i,j}) as explained above. Note that in this setting, the infeasible actions are not allowed, and hence never chosen.

3. If sp(v_{n+1} − v_n) < ε, go to step 4. Otherwise, increment n by 1 and return to step 2.

4. For each π^{i,j} ∈ Ω′, choose the ε-optimal decision that minimizes the costs, given by

\[
d_{\varepsilon}(\pi^{1,0}) = DN,
\]
\[
d_{\varepsilon}(\pi^{i,0}) \in \operatorname*{arg\,min}_{a \in A^{POL}(\pi^{i,0})} \bigl\{ DN(\pi^{i,0}, n),\, PM(\pi^{i,0}, n) \bigr\}, \qquad \text{for } i = 2, \ldots, m,
\]
\[
d_{\varepsilon}(\pi^{i,j}) \in \operatorname*{arg\,min}_{a \in A^{POL}(\pi^{i,j})} \bigl\{ DN(\pi^{i,j}, n),\, PM(\pi^{i,j}, n),\, IN(\pi^{i,j}, n) \bigr\}, \qquad \text{for } i = 1, \ldots, m, \quad j = 1, 2, \ldots, N,
\]

and stop.

Note again that the chosen action should be contained in the action space given the policy, a ∈ A^{POL}(π^{i,j}), where POL is either PI/PMI, PI/PMD, AI/PMI, or AI/PMD. Furthermore, the number of knowledge states per deterioration level, N, should be chosen large enough that for every policy and every knowledge state π^{i,j}, an action different from DN becomes optimal before j = N.

Upon termination of the algorithm, the tightest lower bound on the average costs per time period is given by

\[
\min_{\pi^{i,j} \in \Omega'} \bigl\{ v_{n+1}(\pi^{i,j}) - v_n(\pi^{i,j}) \bigr\},
\]

and the tightest upper bound on the average costs per time period is given by

\[
\max_{\pi^{i,j} \in \Omega'} \bigl\{ v_{n+1}(\pi^{i,j}) - v_n(\pi^{i,j}) \bigr\}.
\]

The average of the two is used as an approximation of the optimal costs per time period g*, given by

\[
g^{*} = \frac{ \min_{\pi^{i,j} \in \Omega'} \bigl\{ v_{n+1}(\pi^{i,j}) - v_n(\pi^{i,j}) \bigr\} + \max_{\pi^{i,j} \in \Omega'} \bigl\{ v_{n+1}(\pi^{i,j}) - v_n(\pi^{i,j}) \bigr\} }{2}.
\]
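To make the recursion concrete, the sketch below implements the value iteration for the least restricted policy, AI/PMD. It reuses the knowledge_states helper sketched in Section 4.3; the treatment of the boundary at j = N is our own simplification (the thesis assumes N is chosen large enough that it is never reached under the optimal policy), and the names are ours, not the author's.

```python
import numpy as np

def value_iteration_aipmd(P, c_i, c_pm, c_cm, N=100, eps=1e-6, max_iter=100_000):
    """Approximate epsilon-optimal long-run average cost per period for the
    AI/PMD policy; v[i, j] plays the role of v_n(pi^{i,j})."""
    m = P.shape[0] - 1
    pi, q = knowledge_states(P, N)
    v = np.zeros((m, N + 1))
    for _ in range(max_iter):
        v_new = np.empty_like(v)
        # DN cost-to-go uses v_{n-1}; beyond j = N we reuse column N (simplification).
        v_next = np.concatenate([v[:, 1:], v[:, -1:]], axis=1)
        dn = q * (c_cm + v[0, 0]) + (1.0 - q) * v_next
        v_new[0, 0] = dn[0, 0]                       # as-good-as-new: only DN allowed
        pm = c_pm + v_new[0, 0]                      # PM restores the unit to state 1
        v_new[1:, 0] = np.minimum(dn[1:, 0], pm)     # just inspected: DN or PM
        inspect = c_i + pi[:, 1:, :m] @ v_new[:, 0]  # IN: expected value of revealed state
        v_new[:, 1:] = np.minimum(np.minimum(dn[:, 1:], inspect), pm)
        diff = v_new - v
        if diff.max() - diff.min() < eps:            # span-based stopping criterion
            return (diff.max() + diff.min()) / 2.0   # approximation of g*
        v = v_new
    raise RuntimeError("value iteration did not converge")
```

Note that the computation order inside the loop mirrors the DN, PM, IN order derived above: the DN costs depend only on the previous iterate, PM reuses the freshly computed value of the as-good-as-new state, and IN reuses the freshly computed values of all just-inspected states.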

5 Numerical experiments

In this section, we construct numerical experiments that apply the discretization of Section 3 to obtain the Markov chain of Section 4.3, and we apply the value iteration algorithm of Section 4.4 to gain insight into the performance of the several policies. The policies considered are explained in Section 4.1.

In the analysis, we use a gamma deterioration process with parameters a = 5, b = 0.2 and failure level L = 1. First, we approximate this continuous process by n = 4 deterioration states and a fifth failed state (∆x = 1/4). We consider a total of N = 100 knowledge states per deterioration level, and the costs equal c_i = 0.2, c_pm = 1 and c_cm = 5.

We start by varying the length ∆t of the time steps in the discretization approach, in order to gain insight into the effect of a relatively close approximation of reality versus a case with larger time steps. Reality corresponds to time steps that are infinitesimally small, so larger time steps give an approximation that is farther from reality. A sketch of this experiment is given below.
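Under the assumptions of the earlier sketches, this first experiment can be reproduced along the following lines; the driver below and the ∆t grid are ours, chosen for illustration, and it reuses the hypothetical transition_matrix and value_iteration_aipmd functions from Sections 3.2 and 4.4.

```python
# Vary the time step dt and report the approximate average cost per period
# for the AI/PMD policy, reusing the earlier sketches.
for dt in (0.01, 0.02, 0.05, 0.1):
    P = transition_matrix(a=5, b=0.2, L=1, n=4, dt=dt)
    g = value_iteration_aipmd(P, c_i=0.2, c_pm=1.0, c_cm=5.0, N=100)
    print(f"dt = {dt}: approximate g* per period = {g:.4f}")
```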

After this, we change the inspection interval, the number of periods between inspections, for the PI/PMI and PI/PMD policies. Note that this does not affect the AI/PMI and AI/PMD policies. It is of interest to see what can be gained with aperiodic inspections. Thereafter, we vary m, the number of states used in the discretization approach. With more states, we are closer to reality, which corresponds to an infinite number of states.

We end this section with a larger number of states, and we present the time until the next action as a function of the observed degradation level. Since current literature, e.g. Dieulle et al. (2003), considers linear inspection scheduling functions, it is of interest what the optimal inspection scheduling functions look like.

5.1 Time steps

We start with the results of changing the length of the time steps, ∆t. For both periodic policies and for every time step ∆t considered, the optimal inspection interval T is determined using numerical analysis. At different time steps, we thus might have different inspection intervals for the periodic policies. The results are presented in Figure 1.


Figure 1: Policies for different time steps ∆t

It can be observed that the AI/PMD policy outperforms the other ones, albeit with only a minor improvement over the PI/PMD policy. There is, however, a visible difference between the optimal policy and the policies that only allow PM at an inspection. We can thus say that delayed PM actions yield a possible gain compared to PM only at an inspection, whereas the possibility of aperiodic inspections gives a somewhat smaller improvement. For the periodic policies, the possibility of a delayed PM action gives a larger gain than for the aperiodic policies.

The overall trend is that costs increase with the length ∆t of the time steps. This was expected beforehand, as larger time steps give an approximation that is farther from reality: the probabilities of jumping states, or even failing, within a period become larger as ∆t increases.

5.2 Inspection intervals

Next, we can also check the difference in performance when changing inspection intervals for the periodic policies. In practice these have to be optimized, but it is interesting to see how the performance is affected for different values of T .


Figure 2: Average long term expected costs for different inspection interval T

The aperiodic policies are obviously unaffected by changing T, as inspections are always allowed. The AI/PMD policy again performs best of all policies. Furthermore, it can be observed that for both periodic policies, PI/PMI and PI/PMD, the performance changes dramatically when choosing an inspection interval different from the optimum. The optimal inspection interval differs between the two policies, with T = 6 optimal for the PI/PMI policy and T = 7 optimal for the PI/PMD policy, at a cost level almost equal to that of the optimal AI/PMD policy.

This can be explained by the fact that the optimal policies for aperiodic and periodic inspection scheduling are very alike when delayed PM is allowed. Especially in this experiment, with a small number of states considered, the difference in performance is almost negligible.

The optimal action tables for all policies, given the last observed state 1, . . . , 4 and, vertically, the number of time periods since the last inspection, are presented below. Note that the table corresponding to the PI/PMI policy has only six rows, as for all states an action different from DN is chosen after at most 6 time periods, the optimal inspection interval for the PI/PMI policy in this example.

Cells that are left empty in the tables correspond to state-time couples that are never reached, since an action is already performed earlier. Note that when performing an inspection, we observe a certain state, and the number of time periods since the last inspection will be 0, which implies we end up somewhere in the first row of the tables. A PM action maintains the unit to the as-good-as-new state, which implies that we end up at the top-left entry of the tables.


Table 1 presents the least interesting results, as delayed PM is not allowed in this policy and we deal with periodic inspections. Since PM can only be performed at an inspection, the optimal inspection schedule inspects the system after six periods when state 1 or 2 is observed, and immediate PM is performed when we are in the more deteriorated states 3 or 4.

Table 2 shows the benefits of a delayed PM action for periodic policies. Both in state 2 after 3 periods and in state 3 after one period, a delayed PM action is performed. Note that the optimal inspection interval equals T = 7, which is also optimal in the AI/PMD policy.

Table 3 shows the benefits of aperiodic inspections, as an inspection is scheduled after 7 time periods when state 1 is observed, and after 4 periods when state 2 is observed. PM is only allowed at an inspection, and is hence performed immediately when we observe the more deteriorated states 3 and 4.

In Table 4, one can again see the benefits of delayed PM, as this is the ideal action when state 2 or 3 is observed. Given these states, the best action is to perform PM after 5 periods or 1 period, respectively. Due to the combination of aperiodic inspections and the option of delayed PM, the AI/PMD policy performs at least as well as the other policies.

What is interesting is that the AI/PMI policy, presented in Table 3, has different optimal actions compared to the PI/PMD policy, yet they arrive at almost equal optimal costs. This leads to the conclusion that the optimal policy might not be unique in this case.

5.3 Number of deterioration states m

Another interesting experiment is increasing the number of deterioration states in the discretization approach. Again, insights can be obtained regarding reality, which can be seen as having an infinite number of states. The expectation is that the policies perform better when there are more states, as a higher number of states brings us closer to reality. The parameters are chosen as before, with ∆t = 0.01 and a varying number of states, from m = 4 (thus a total of m + 1 = 5 states) up to m = 24. For the periodic policies, the optimal inspection interval T is determined for each number of states we consider. The results of the four policies are presented in Figure 3.

Figure 3: Policies for different number of deterioration states


As the number of states increases, we observe the two policies with delayed PM, PI/PMD and AI/PMD, to perform similarly. There is a small gain when we compare the PI/PMI and AI/PMI policies, which becomes very small when we compare the PI/PMD and AI/PMD policies. The main conclusion from this experiment is thus that the effect of aperiodic inspections is observable, albeit somewhat small, when minimizing maintenance scheduling costs. The effect of possibly delayed PM is larger, as the costs can be decreased significantly when applying delayed PM.

5.4 Optimal action function

Next, we present results on the optimal action given an observed deterioration level. In this subsection, only the aperiodic policies are considered, as they are the most interesting. We start with the AI/PMI policy, where the PM action is thus only allowed at an inspection. The optimal action, as a function of the observed state, is presented in Figure 4.

Figure 4: AI/PMI: optimal action given observed state

It can be seen that until state 28 is observed, the optimal action is to inspect the system after a certain number of periods; from an observed state 29 onwards, immediate PM is performed. It is interesting to note that the function is not exactly linear: sometimes the number of periods until the next inspection decreases somewhat faster when we move one state further in the model. Up to now, the literature considers linear functions, which seem to give a good representation already, but Figure 4 suggests that results could be improved slightly by not forcing a linear structure on the inspection schedule. We continue with the AI/PMD policy, for which the optimal action as a function of the observed state is presented in Figure 5.


Figure 5: AI/PMD: optimal action given observed state

Interestingly, when comparing Figure 4 and Figure 5, we observe that the AI/PMD policy performs an inspection earlier than the AI/PMI policy. This means that under the AI/PMD policy more certainty is preferred than under the AI/PMI policy, as a delayed PM action can then be scheduled, which in turn proves beneficial in saving costs.

The delayed PM action is observed to be optimal from an observed state 15 onwards. It is then optimal not to inspect the system, as the risk of having to perform PM anyway becomes too high, and when the actions have to be combined, both inspection costs and PM costs are incurred. From an observed state 39 onwards, immediate PM is performed, thus somewhat later than with the AI/PMI policy, due to the possibility of delayed PM.

Both the inspection schedule and the delayed PM action, as functions of the observed state, are seen not to be exactly linear. A linear approximation, however, will again likely give a good representation.

6 Conclusion

In this thesis, we considered a single unit that deteriorates continuously over time, and for which inspections are needed in order to observe the actual deterioration level. We introduced a discretization approach in order to transform the continuous-state, continuous-time deterioration process into a discrete-state, discrete-time deterioration process. Using this discretization approach, a POMDP was formulated, which was in turn solved using the value iteration algorithm.

Four different policies were applied to the POMDP, with (a)periodic inspections and a PM action either at an inspection or possibly delayed. It was thus of interest what could be gained by an aperiodic inspection policy compared to a periodic inspection policy, what the benefits of delayed PM actions are compared to the obligation to combine a PM action with an inspection, and ultimately what the combination of aperiodic inspections and possibly delayed PM yields.

Both policies with possibly delayed PM performed best, with the aperiodic AI/PMD policy outperforming all other policies. The gain of a delayed PM action compared to the limitation of combining PM with an inspection was especially notable, whereas the gain of aperiodic inspections was somewhat smaller, but still visible.

After this, we continued with a larger number of states and smaller time steps for the discretization approach, in order to obtain results relatively close to reality. We showed the time until the next action, either an inspection or PM, as a function of the observed deterioration level. Current literature considers linear functions, whereas we observed slight deviations from linearity. Comparison between the aperiodic policies shows that when delayed PM is allowed, inspections are performed sooner, which can be explained by the fact that more certainty is desired, as delayed PM can then be scheduled when necessary.

Further research could investigate the exact shape of the optimal actions as a function of the observed state in more detail, as the current linearity assumption might be improved upon at least slightly, as shown in this study. Besides, extensions with imperfect inspections or imperfect repairs are possible additions to this model for future research, in order to make the model more realistic. Another extension would be to consider downtime costs and non-negligible durations of the actions. Such a setting is more realistic, as a system that is not running likely incurs costs due to delayed production, and the assumption that actions take a negligible amount of time might be strong compared to reality.

References

Abdel-Hameed, M. (1975). A gamma wear process. IEEE Transactions on Reliability 24(2), 152–153.

Berrade, M.D., C.A.V. Cavalcante, and P.A. Scarf (2013). Modelling imperfect inspection over a finite horizon. Reliability Engineering & System Safety 111, 18–29.

Castanier, B., C. Bérenguer, and A. Grall (2003). A sequential condition-based repair/replacement policy with non-periodic inspections for a system subject to continuous wear. Applied Stochastic Models in Business and Industry 19(4), 327–347.

De Jonge, B., R. Teunter, and T. Tinga (2017). The influence of practical factors on the benefits of condition-based maintenance over time-based maintenance. Reliability Engineering & System Safety 158, 21–30.

Dekker, R. (1996). Applications of maintenance optimization models: a review and analysis. Reliability Engineering & System Safety 51(3), 229–240.

Dieulle, L., C. Bérenguer, A. Grall, and M. Roussignol (2003). Sequential condition-based maintenance scheduling for a deteriorating system. European Journal of Operational Research 150(2), 451–461.

Do, P., A. Voisin, E. Levrat, and B. Iung (2015). A proactive condition-based maintenance strategy with both perfect and imperfect maintenance actions. Reliability Engineering & System Safety 133, 22–32.

Flage, R., D.W. Coit, J.T. Luxhøj, and T. Aven (2012). Safety constraints applied to an adaptive Bayesian condition-based maintenance optimization model. Reliability Engineering & System Safety 102, 16–26.

Golmakani, H.R. and F. Fattahipour (2011). Age-based inspection scheme for condition-based maintenance. Journal of Quality in Maintenance Engineering 17(1), 93–110.

Grall, A., C. Bérenguer, and L. Dieulle (2002). A condition-based maintenance policy for stochastically deteriorating systems. Reliability Engineering & System Safety 76(2), 167–180.

Grall, A., L. Dieulle, C. Bérenguer, and M. Roussignol (2002). Continuous-time predictive-maintenance scheduling for a deteriorating system. IEEE Transactions on Reliability 51(2), 141–150.

Maillart, L.M. (2006). Maintenance policies for systems with condition monitoring and obvious failures. IIE Transactions 38(6), 463–475.

Puterman, M.L. (2014). Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley & Sons.

Van Noortwijk, J.M. (2009). A survey of the application of gamma processes in maintenance. Reliability Engineering & System Safety 94(1), 2–21.
