Joint Optimisation of Maintenance and Inventory with Perfect and Imperfect Repairs

O.J.J. Meijburg

Master’s Thesis Econometrics, Operations Research and Actuarial Studies

Supervisors: prof. dr. R.H. Teunter, H. Holtman

Abstract

Oscar Meijburg

September 18, 2016

1 Introduction

Maintenance is defined as all actions performed in order to retain equipment in, or restore it to, a given condition (Dhillon, 2002). These actions are required as equipment may deteriorate and eventually fail as a result of age, usage, and random shocks (Olde Keizer and Teunter, 2014). Performing maintenance effectively and efficiently is complicated by the fact that deterioration and failures generally occur randomly, making it difficult to accurately predict the right time to perform maintenance. Performing maintenance too early leads to unnecessary costs due to performing more maintenance than is required, while postponing maintenance increases the risk of a failure and all costs related to it. For manufacturing firms, maintenance costs often constitute a large part of total production costs, sometimes up to 70%, and up to a third of those costs are incurred unnecessarily (Bengtsson, 2007).

most efficient (Prajapati et al., 2012; Camci, 2009), as using time-based maintenance strategies tends to schedule maintenance earlier than necessary. CBM, however, has the benefit of simultaneously being able to postpone maintenance and prevent breakdowns, by taking the condition of the equipment into account. Therefore, it should be the strategy of choice when it is possible to monitor the deterioration level and likelihood of failure of the equipment.

Most existing research focuses on systems containing a single component (Hong et al., 2014), but the resulting optimal policies cannot always be translated directly to optimal policies for systems with multiple components, which are common in practice. According to Nicolai and Dekker (2008), three types of dependencies between components can be distinguished. First, economic dependencies exist either when costs can be saved by maintaining multiple components at once, or when downtime due to maintenance on several machines simultaneously leads to increased costs, for instance because production capacity becomes too low. Secondly, structural dependency occurs when multiple components are part of the same structure, such that maintenance of one component also requires maintenance on one or more other components. Finally, stochastic dependency exists either when the state of a component influences the deterioration rate and failure probability of other components, or when external factors may affect the deterioration rates or cause failures of multiple components simultaneously. A type of dependence that will be discussed in this thesis, but does not fit into any of these categories, arises when multiple components in a system require the same type of spare part to be maintained, and thus make use of a shared pool of spare parts. If these spare parts cannot be procured instantly, then it is not possible to perform maintenance if there is no spare part on stock. The requirement for spare parts thus means that decisions regarding spare parts inventory should be coordinated with maintenance decisions. Existing research on the combined optimization of decisions on maintenance and inventory is limited, however, as most research considers only one of these aspects (Van Horenbeek et al., 2013).

closely related work to this is done by Olde Keizer et al. (2016), the only significant difference being that no imperfect maintenance is considered there.

The optimal decisions with regard to maintenance and inventory control will be found by formulating the problem as a Markov Decision Process (MDP). The MDP can then be solved to find an optimal policy, which comprises an optimal decision for each possible state the system can be in, maximising the mean long-run rewards, or in this case minimising costs. The main challenge of using MDPs to find optimal policies for maintenance of multi-component systems is caused by the so-called “Curse of Dimensionality”: the total number of possible states increases exponentially as a function of the number of components (Dekker et al., 1997). The requirement for spare parts on stock and on order to be represented in the state of the system further increases the complexity of the problem. Methods do exist, however, to reduce this effect by making use of the underlying structure of the system (Kolobov, 2012). In this thesis these methods have been investigated and applied where possible.

2 Problem Formulation

The system under consideration consists of N components. Each component n, n = 1, 2, . . . , N, can be in M different deterioration states 1, 2, . . . , M, where state 1 is the good-as-new state and M is the failed state. A discrete-time system is considered, and at the start of each period the state of all components becomes known, after which decisions with regard to maintenance of each component and the ordering of spare parts are made. Deterioration occurs in discrete steps, and it is assumed that it cannot improve the condition of components. We thus assume that the number of deterioration steps of a component per period is Poisson distributed with rate λ, so the expected lifetime of each component is T = (M − 1)/λ periods. The components are identical, and thus make use of a shared pool of spare parts.

For each component, two different types of maintenance are available, while it is also possible to not perform any maintenance on a component, i.e. do nothing (DN). The first maintenance type is a replacement, consuming a spare part and returning the component to the good-as-new state. Since a replacement requires a spare part, the number of components that are replaced cannot exceed the available number of spare parts. Secondly, Imperfect Maintenance (IM) can be performed, reducing the current state by a given number of δ levels. If δ exceeds the number of levels by which the deterioration state of a component can be reduced, then that component is returned to the good-as-new state 1. Imperfect maintenance cannot be performed when a component has already failed; in this case it is only possible to replace it or do nothing. Both types of maintenance are assumed to take no time, as in practice repair or replacement times are negligible compared to the total lifetime of the components (Olde Keizer et al., 2016). As a result, components can still deteriorate in the period during which they are maintained. Spare parts will arrive after a fixed lead time of L time periods. Upon arrival, these can be immediately used for possible replacements.

2.1 Cost Specification

3 Markov Decision Process Formulation

3.1 Introduction to MDPs

For the joint maintenance and inventory problem at hand, changes in the properties of the system are partially the result of random effects, and partially of the decisions made. For these types of problems, MDPs offer a mathematical framework for modelling the system and the decisions to be made. MDPs typically consider discrete-time problems, though variations exist which allow for the modelling of continuous-time problems. In this case, we will use the discrete-time formulation and model the problem as such. Continuous-time evaluation and decision making can still be approximated this way by making the time intervals small enough. The points in time at the start of the periods, when the system is evaluated and decisions are made, are the so-called decision epochs.

There are four elements which constitute an MDP. The first is the set S of all states in which the system can be. At each decision epoch, the system must be in one of these states. The second is the set of all available actions A which can be performed. This set comprises all actions for the system, though it is possible that some actions cannot be performed when in a certain state. The third element is the set of transition probabilities P. This set contains for each action an |S| × |S| matrix, specifying for that action the probability to go from each possible current state to each possible new state. The total size of this set is thus |A| × |S| × |S|. Finally, the reward function R of size |S| × |A| specifies a numeric reward for each combination of an action and the state in which it is taken. In some cases the reward value also depends on the new state that is reached after an action has been performed, but in this case it does not. Furthermore, an important property is that both the transition probabilities and the reward values are time-independent. Note also that the transition probabilities and reward values only depend on the current state, and not on the sequence of states leading up to that state; this is called the Markov property.

The solution of an MDP consists of a policy π, which assigns an action to be chosen to each state, and an objective value. The objective value can either be the utility, defined as the current reward obtained from using a certain action in a state plus all expected future rewards after having used that action, or the gain, defined as the long-run average utility per time step. Since the total utility will often not be bounded for infinite-horizon problems such as ours, the gain is the more appropriate objective. There are several methods for finding the optimal policy which maximises these rewards. The two most commonly used, both dynamic programming approaches, are the policy iteration algorithm and the value iteration algorithm. Since the state space expands exponentially as the number of components increases, the value iteration algorithm is the most appropriate (Spieksma and Núñez-Queija, 2015).

gain g(s) per state can then be acquired by taking the difference between two consecutive iterations. The gain has converged when the difference between the maximum and minimum gain is smaller than the specified parameter ε. At this point the optimal policy and its respective mean long-run reward per time period have been found. The steps performed by the value iteration algorithm are given below.

1. Set iteration n = 0 and $U^n(s) = 0$ for each state $s \in S$.

2. Increment n by 1 and for each $s \in S$ compute $U^n(s)$ by:

$$U^n(s) = \max_{a \in A} \Big\{ r(s, a) + \sum_{i \in S} p(i \mid s, a)\, U^{n-1}(i) \Big\} \qquad (1)$$

Here r(s, a) denotes the reward of using action a in state s, and p(i|s, a) denotes the probability of ending up in state i when using action a in state s.

3. Calculate the maximum and minimum gain by $g_{\max} = \max_{s \in S} \{U^n(s) - U^{n-1}(s)\}$ and $g_{\min} = \min_{s \in S} \{U^n(s) - U^{n-1}(s)\}$.

4. If $g_{\max} - g_{\min} < \varepsilon$, then go to step 5. Otherwise, return to step 2.

5. Select the optimal action for each state $s \in S$ by:

$$\pi(s) = \arg\max_{a \in A} \Big\{ r(s, a) + \sum_{i \in S} p(i \mid s, a)\, U^{n-1}(i) \Big\} \qquad (2)$$
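
The five steps above can be sketched in Python as follows. This is a minimal illustration and not the solver used in the thesis: the function name `value_iteration`, the array layout (`P[a, s, i]` for p(i|s, a) and `R[s, a]` for r(s, a)) and the two-state toy problem are all assumptions made for the example.

```python
import numpy as np

def value_iteration(P, R, eps=1e-9, max_iter=100_000):
    """Undiscounted value iteration with the span-based stopping rule.

    P has shape (|A|, |S|, |S|) with P[a, s, i] = p(i | s, a).
    R has shape (|S|, |A|) with R[s, a] = r(s, a).
    Returns the policy and the lower and upper bound on the gain.
    """
    n_states = P.shape[1]
    U_prev = np.zeros(n_states)                       # step 1
    for _ in range(max_iter):
        # Q[s, a] = r(s, a) + sum_i p(i | s, a) * U_prev[i]
        Q = R + np.einsum("asi,i->sa", P, U_prev)
        U = Q.max(axis=1)                             # step 2, equation (1)
        diff = U - U_prev
        g_max, g_min = diff.max(), diff.min()         # step 3
        if g_max - g_min < eps:                       # step 4
            return Q.argmax(axis=1), g_min, g_max     # step 5, equation (2)
        U_prev = U
    raise RuntimeError("value iteration did not converge")

# Hypothetical toy problem: state 0 = working, state 1 = failed;
# action 0 = do nothing, action 1 = repair (the machine may fail again
# within the same period, mirroring deterioration after maintenance).
P = np.array([
    [[0.9, 0.1], [0.0, 1.0]],   # do nothing
    [[0.9, 0.1], [0.9, 0.1]],   # repair
])
R = np.array([
    [0.0, -2.0],     # working: no cost, or repair cost 2
    [-10.0, -12.0],  # failed: downtime cost 10, plus repair cost 2
])

policy, g_min, g_max = value_iteration(P, R)
print(policy, g_min, g_max)
```

On this toy problem the two bounds on the gain coincide after a few iterations at about −1.2 per period, and repair is selected only in the failed state.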

3.2 The Curse of Dimensionality

This subsection sets out the requirements for the input sets of the MDP, in order to describe the complexity of the problem. A full description of the way these input sets are formulated is provided in section 4. The Markov property states that rewards and transition probabilities must only depend on the current state, and not on any preceding states. All relevant information on the current status of the system should thus be represented in a state. Because of this, not only the deterioration level of each component and the number of spare parts on stock must be represented in the state of the system, but also the number of spares that have already been ordered and will arrive 1, 2, . . . , and L − 1 periods from the current decision epoch. In order for the state space to be finite, which is required in order to solve the MDP using the value iteration algorithm, a bound must be set for the spare parts on stock and on order. This can be done by setting an upper limit on the inventory position (total number of spare parts on stock or on order). Since this constraint only exists due to computational requirements, this upper limit P should be set high enough such that it does not influence the optimal policy found.

The number of spare parts that can be ordered is at most equal to P, and it is also possible to order 0 spare parts, hence there are P + 1 different actions available in terms of ordering spare parts. This gives the total size of the action space |A| as $(P + 1) \cdot 3^N$.

It can already be seen that both the action space and the state space increase exponentially in N, but the real problem in terms of complexity arises when defining the set of possible rewards and in particular the set of transition probabilities. As stated previously, the size of the set of rewards R is |S| × |A|, while the set of transition probabilities has size |A| × |S| × |S|, which is $(P + 1) \binom{P+L}{P}^2 3^N M^{2N}$. Even before the value iteration algorithm can be used, this set has to be generated and stored, requiring a large amount of time and memory when N increases. The algorithm itself then also suffers from the size of the set of transition probabilities, as values from this set must very frequently be looked up and used for the computation of the utilities.
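
To get a feeling for how quickly these sets grow, the sizes can be evaluated directly. The sketch below assumes that the state space has size $\binom{P+L}{P} M^N$, which is what the size of the set of transition probabilities implies when combined with $|A| = (P + 1) \cdot 3^N$; the function name `mdp_sizes` is hypothetical. The values M = 5 and L = 3 are those of Table 1, and the values of P are those reported in section 5.1.

```python
from math import comb

def mdp_sizes(N, M, L, P):
    """Return (|S|, |A|, |A| * |S|^2) for the flat formulation."""
    # Inventory vectors (s_0, ..., s_{L-1}) with non-negative entries and sum <= P.
    n_inventory = comb(P + L, P)
    n_states = n_inventory * M ** N
    n_actions = (P + 1) * 3 ** N
    return n_states, n_actions, n_actions * n_states ** 2

for N, P in [(2, 2), (3, 2), (4, 3)]:
    n_states, n_actions, n_transitions = mdp_sizes(N, M=5, L=3, P=P)
    print(f"N={N}, P={P}: |S|={n_states}, |A|={n_actions}, "
          f"transition entries={n_transitions:,}")
```

Going from two to four components raises the number of transition entries from roughly 1.7 million to about 5 × 10^10.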

deterministic, and thus follow directly from the current state and the action taken. As a result the full set of transition probabilities is not required for these states: an indication of the new state for each combination of state and possible action suffices. For this, at most |A| × |S| elements would be required, instead of all |A| × |S| × |S| elements from the set of transition probabilities.

4 The model

Alongside the main model, which aims to find an optimal solution to the joint maintenance and inventory decision problem in an efficient manner, several variations of the model were created. Some of these alternatives were designed to determine the computational advantages, in terms of solvability and efficiency, of improvements in how the solution is generated. Other alternatives were designed to produce strategies which differ from the optimal policy, in order to compare the objective values of several strategies with each other.

4.1 Flat State Space Formulation

Initially, the model was designed making use of the flat representation of the MDP. For this, an enumerated list covering all states was created. Each item in the list contains details on the state, including the deterioration state of each component and the spare parts on stock and on order. So even though the number of a state by itself does not provide information on the properties of that state, looking up that number in the list of states will provide that information. The same was done for the set of actions: a numbered list of actions was created. For each action the list shows the specific maintenance action for each component, as well as the number of spares ordered in that action.
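
A minimal sketch of such enumerated lists is given below. The tuple layout (deterioration levels first, then spares on stock and on order), the letters D, I and R for the maintenance actions, and the function names are assumptions chosen for the illustration; the thesis implementation in R and C++ may be organised differently.

```python
from itertools import product

def enumerate_states(N, M, L, P):
    """All states (i_1, ..., i_N, s_0, ..., s_{L-1}) with inventory position <= P."""
    inventory = [s for s in product(range(P + 1), repeat=L) if sum(s) <= P]
    return [d + s for d in product(range(1, M + 1), repeat=N) for s in inventory]

def enumerate_actions(N, P):
    """All actions (a_1, ..., a_N, a_O): one of D/I/R per component plus an order size."""
    return [m + (o,) for m in product("DIR", repeat=N) for o in range(P + 1)]

states = enumerate_states(N=2, M=5, L=3, P=2)
actions = enumerate_actions(N=2, P=2)

# The position in the list is the state number; a dictionary gives the reverse lookup.
state_index = {s: k for k, s in enumerate(states)}

print(len(states), len(actions))   # sizes of the two numbered lists
print(states[0], actions[0])       # first entry of each list
```

Keeping the reverse lookup in a dictionary avoids searching the list each time the number of a given state is needed.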

From these lists, a total set of |A| × |S| × |S| transition probabilities can be generated. This set is formulated as an |S| × |S| matrix for each possible action. The first step towards creating this set is to generate an M × M matrix p of transition probabilities for the deterioration of a single component. Since the deterioration of each component is Poisson distributed with rate λ, the elements of this matrix can be computed as follows. Let i be the state before deterioration and j be the state after deterioration. Because the state cannot improve as a result of deterioration, we have p(i, j) = 0 for i > j. When i ≤ j we have

$$p(i, j) = \frac{\lambda^{j-i}}{(j-i)!} e^{-\lambda}$$

for i ∈ {1, . . . , M} and j ∈ {1, . . . , M − 1}. The probability for a component to fail, which happens when j = M, is the remaining probability mass:

$$p(i, M) = 1 - e^{-\lambda} \sum_{k=0}^{M-i-1} \frac{\lambda^{k}}{k!}.$$

The matrix p now gives the transition probabilities for a single component when no maintenance is performed. When maintenance is performed, the state is first set to max(i − δ, 1) for imperfect maintenance and to 1 for a replacement. After this the component will deteriorate. To take this into account, we introduce the intermediate state $\bar{i}$. The action performed on a component can either be Do Nothing (D), Imperfect Maintenance (I), or Replace (R), and is denoted by a, with a ∈ {D, I, R}. The intermediate state is then given by:

$$\bar{i}(a) = \begin{cases} i & \text{if } a = D, \\ \max(i - \delta, 1) & \text{if } a = I, \\ 1 & \text{if } a = R. \end{cases}$$

Since components can still deteriorate in the period when maintenance has been performed, the probability for a single component to deteriorate to state j when action a has been performed can be found by looking up $p(\bar{i}(a), j)$. The combined deterioration probabilities for all components can then be determined by multiplying the probabilities from the deterioration matrix of the single component. If the N components start in deterioration states $i_1, i_2, \dots, i_N$ and maintenance actions $a_1, a_2, \dots, a_N$ are performed, then the probability to end up in deterioration states $j_1, j_2, \dots, j_N$ will be $\prod_{n=1}^{N} p(\bar{i}_n(a_n), j_n)$.
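
The construction of the single-component matrix and the product over components can be illustrated with a short sketch. The function names and the values λ = 0.5 and δ = 2 are assumptions for the example only; the failure column is filled with the probability mass that remains after all non-failed states.

```python
import math
import numpy as np

def deterioration_matrix(M, lam):
    """M x M matrix with entry [i-1, j-1] equal to p(i, j) for states 1..M."""
    p = np.zeros((M, M))
    for i in range(1, M + 1):
        for j in range(i, M):                        # non-failed targets j < M
            k = j - i                                # number of deterioration steps
            p[i - 1, j - 1] = lam ** k / math.factorial(k) * math.exp(-lam)
        p[i - 1, M - 1] = 1.0 - p[i - 1, : M - 1].sum()   # failure: remaining mass
    return p

def intermediate_state(i, action, delta):
    """State right after maintenance, before deterioration in the same period."""
    if action == "D":
        return i
    if action == "I":
        return max(i - delta, 1)
    if action == "R":
        return 1
    raise ValueError(f"unknown action {action!r}")

def joint_probability(p, start, actions, end, delta):
    """Product over components of p(intermediate state, new state)."""
    prob = 1.0
    for i, a, j in zip(start, actions, end):
        prob *= p[intermediate_state(i, a, delta) - 1, j - 1]
    return prob

p = deterioration_matrix(M=5, lam=0.5)
# Component 1: state 3, imperfect maintenance; component 2: failed, replaced.
prob = joint_probability(p, start=(3, 5), actions=("I", "R"), end=(2, 1), delta=2)
print(p.round(4))
print(prob)
```

Each row of `p` sums to one, and the failed state M is absorbing as long as no maintenance is performed.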

The properties of the state in terms of spare parts follow deterministically from the previous state and the action taken. We denote the number of spare parts that will arrive t periods from now by $s_t$ with t = 0, 1, . . . , L − 1, where $s_0$ denotes the spare parts that are already on stock. If the number of spare parts on stock and on order for the current state is given by $(s_0, s_1, s_2, \dots, s_{L-2}, s_{L-1})$ and in the selected action a total number of $n_R(a) = \sum_{n=1}^{N} \mathbf{1}_{a_n = R}$ components are replaced and $a_O$ spare parts are ordered, then the new properties regarding spares are given by $(s_0 + s_1 - n_R(a), s_2, s_3, \dots, s_{L-1}, a_O)$. The new state must have these properties regarding spares on stock or on order, otherwise the state cannot be reached with the action used and the probability will be 0. Writing out the full state where elements 1, . . . , N represent deterioration levels, and elements N + 1, . . . , N + L spare parts on stock or on order, the probability to go from state $x = (i_1, \dots, i_N, s_0, s_1, s_2, \dots, s_{L-2}, s_{L-1})$ to state $y = (j_1, \dots, j_N, s_0 + s_1 - n_R(a), s_2, s_3, \dots, s_{L-1}, a_O)$ with action $a = (a_1, \dots, a_N, a_O)$ is given by $\prod_{n=1}^{N} p(\bar{i}_n(a_n), j_n)$. This allows us to generate the set P containing a full |S| × |S| matrix for each action a ∈ A.
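
The deterministic update of the spare parts part of the state amounts to a shift of the order pipeline. A minimal sketch, assuming L ≥ 2 and a tuple layout $(s_0, \dots, s_{L-1})$; the function names are hypothetical:

```python
def count_replacements(maintenance):
    """n_R(a): number of components that receive a replacement."""
    return sum(a == "R" for a in maintenance)

def next_inventory(inventory, n_replaced, n_ordered):
    """Shift the pipeline: arrivals join the stock, replacements consume it,
    and the new order enters at the end of the pipeline."""
    s0, s1, *rest = inventory
    return (s0 + s1 - n_replaced, *rest, n_ordered)

action = ("R", "D", 1)                    # replace component 1, order one spare
n_rep = count_replacements(action[:-1])
print(next_inventory((1, 0, 2), n_rep, action[-1]))   # -> (0, 2, 1)
```

Because the result is a single tuple, only one target inventory state has non-zero probability for any given state and action.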

The set of rewards R is formulated as an |S| × |A| matrix, specifying the reward acquired from each combination of state and action used. Since we have a cost minimisation problem while the value iteration algorithm always maximises rewards, rewards are set as the negative of their associated costs, and thus all rewards are non-positive. To ensure that the value iteration algorithm never selects an infeasible action, we first set the rewards of all infeasible actions to an extremely low number; in this case $-10^9$ was used. The following properties can make an action infeasible:

• The maximum inventory position may not be exceeded. This means that all actions where the number of spare parts ordered minus the number of replacements performed is greater than the difference between the maximum inventory position and the inventory position of the current state are infeasible, thus when $a_O - n_R(a) > P - \sum_{k=0}^{L-1} s_k$.

• The number of replacements cannot exceed the number of spare parts available: any action a for which $n_R(a) > s_0$ is infeasible.

• Imperfect maintenance cannot be performed on a component that is in the failed state. So if $\exists n \in \{1, \dots, N\} : i_n = M \wedge a_n = I$, then a is infeasible.

At this point all infeasible actions have reward $-10^9$ and all feasible actions reward 0. The final rewards can then be computed by subtracting all costs (as specified in section 2.1) from the rewards of the feasible actions of each state. The reward based on the state and action used, given that the action is feasible in said state, can then be written as follows.

$$R(s, a) = -c_r\, n_R(a) - c_i \sum_{n=1}^{N} \mathbf{1}_{a_n = I} - c_d \sum_{n=1}^{N} \mathbf{1}_{i_n = M} - c_s \mathbf{1}_{n_R(a) > 0} - c_h s_0 - c_o a_O$$
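
The feasibility rules and the reward expression can be combined in one function, sketched below. The function name, the argument layout and the dictionary of costs are assumptions for the illustration; the cost values are the ones used in section 5.1.

```python
INFEASIBLE = -1e9   # reward assigned to infeasible actions

def reward(det, inv, maintenance, order, M, P, costs):
    """Reward R(s, a) for deterioration levels det, inventory inv = (s_0, ..., s_{L-1}),
    maintenance actions (one of D/I/R per component) and order size a_O."""
    n_rep = sum(a == "R" for a in maintenance)
    infeasible = (
        order - n_rep > P - sum(inv)                 # maximum inventory position exceeded
        or n_rep > inv[0]                            # more replacements than spares on stock
        or any(i == M and a == "I" for i, a in zip(det, maintenance))  # IM on failed part
    )
    if infeasible:
        return INFEASIBLE
    return -(
        costs["cr"] * n_rep
        + costs["ci"] * sum(a == "I" for a in maintenance)
        + costs["cd"] * sum(i == M for i in det)
        + costs["cs"] * (n_rep > 0)
        + costs["ch"] * inv[0]
        + costs["co"] * order
    )

costs = {"cr": 5, "ci": 5, "cs": 2, "cd": 100, "ch": 0.5, "co": 1}
r_ok = reward((5, 3), (1, 0, 0), ("R", "I"), 1, M=5, P=2, costs=costs)
r_bad = reward((5, 3), (1, 0, 0), ("R", "R"), 0, M=5, P=2, costs=costs)
print(r_ok, r_bad)
```

In the first call one failed component is replaced and the other receives imperfect maintenance, so every cost term except a second replacement contributes; the second call asks for two replacements with a single spare on stock and is therefore infeasible.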

4.2 Factored Formulation

As suggested in the previous section, a factorisation can be made by separating the states into two variables, one indicating the deterioration states of all components and one indicating the number of spare parts on stock and on order. For the set of states, two numbered lists are now kept. We denote the set of deterioration states by $S_d$ and the set of all inventory states regarding the status of spare parts by $S_i$. The full state space then consists of all possible combinations of one state from $S_d$ and one from $S_i$. The set of actions A is defined in the same way as in the flat state space formulation. We now need two sets of transition probabilities, one to determine the change in the deterioration state based on the selected action, and one for the change in the inventory state. For the inventory state, however, the new state is fully deterministic and follows directly from the previous state and the action performed. A transition probability matrix for every action would therefore be very sparse and would contain only a single 1 on every row. This allows for a more compact formulation. Instead of a full $|S_i| \times |S_i|$ matrix for every action a ∈ A, a single $|A| \times |S_i|$ matrix denoting the number of the new inventory state based on the current inventory state and action suffices.

transition probabilities for all N components, from the intermediate state to the deteriorated state. Note that in total these two matrices are far smaller than the full set of $|S_d| \times |S_d|$ matrices for each action a ∈ A that would be required to provide the transition probabilities directly from the initial state to the deteriorated state. This reduction in size not only reduces the required memory for storing the transition probabilities, but also improves computation speed. The rewards are structured in the same way as in the flat state space formulation. They can be split up in the same way, into rewards for deterioration states and rewards for inventory states, by generating the set of rewards as two matrices, one of size $|A| \times |S_d|$ and one of size $|A| \times |S_i|$. This can be done because there are no interdependencies between these two sets of rewards. The cost of a replacement, for instance, only depends on the inventory state, as there must be sufficient spares available, but does not depend on the deterioration state of the component. Similarly, the cost of imperfect maintenance only depends on the deterioration state but is independent of the state of spare parts. The combined reward can simply be found by adding the reward of the inventory state and the reward of the deterioration state based on the selected action.

4.3 Unique States Formulation

example will illustrate this.

If we consider a two-component system, then previously the deterioration state $(i_1, i_2)$ indicated that component 1 was in state $i_1$ and component 2 in state $i_2$. A transition to state $(j_1, j_2)$ could simply be found by multiplying the probability of a component moving from state $i_1$ to state $j_1$ with that of a component moving from state $i_2$ to state $j_2$. In the unique states formulation, however, state $(i_1, i_2)$ only indicates that there is one component in state $i_1$ and another in state $i_2$. A transition to state $(j_1, j_2)$ could still mean that component 1 deteriorates from state $i_1$ to $j_1$ and component 2 from $i_2$ to $j_2$, but for instance also that component 1 deteriorates from $i_1$ to $j_2$ and component 2 from $i_2$ to $j_1$. The total probability of transitioning from $(i_1, i_2)$ to $(j_1, j_2)$ in the unique states formulation is given by the sum of the following four probabilities of the original formulation:

• From $(i_1, i_2)$ to $(j_1, j_2)$,
• From $(i_1, i_2)$ to $(j_2, j_1)$,
• From $(i_2, i_1)$ to $(j_1, j_2)$,
• From $(i_2, i_1)$ to $(j_2, j_1)$
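
The saving in the number of deterioration states can be made concrete with a small sketch. It assumes, as the two-component description suggests, that a unique state is an unordered collection of deterioration levels, stored here as a tuple in ascending order; the function names are hypothetical and M = 5 is taken from Table 1.

```python
from itertools import combinations_with_replacement
from math import comb

def unique_deterioration_states(N, M):
    """All sorted tuples of N deterioration levels from 1..M (order is irrelevant)."""
    return list(combinations_with_replacement(range(1, M + 1), N))

def canonical(state):
    """Map any ordering of the component levels to its sorted representative."""
    return tuple(sorted(state))

for N in range(2, 8):
    n_unique = len(unique_deterioration_states(N, M=5))
    assert n_unique == comb(5 + N - 1, N)       # number of multisets of size N
    print(f"N={N}: ordered states={5 ** N}, unique states={n_unique}")

print(canonical((4, 2)), canonical((2, 4)))     # both map to (2, 4)
```

For N = 7 and M = 5 this reduces 78125 ordered deterioration states to 330 unordered ones.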

4.4 Alternative Solutions

The three proposed model formulations can be used to determine the computational advantages gained by designing more efficient models of the system. The next variations will all make use of the most efficient model, but will use it to generate different solutions besides the optimal solution. Although the value iteration algorithm is designed to always return the optimal policy, it can also be used to generate alternative, sub-optimal policies by modifying the problem formulation. The results can be used to compare the optimal policy to deviating alternatives. A simple way of ensuring that a sub-optimal policy is found is by making some actions infeasible in certain states. The value iteration algorithm will then find an optimal solution in the reduced solution space. Though the policy found here may be equal to the optimal policy, adequate limitations on the solution space will ensure that a different and thus inferior policy is generated, which can be used to determine the benefit of using the optimal policy over alternative policies.

For the first alternative policy, the system will be forced to make use of the (s, S) inventory policy. In an (s, S) policy, new items are ordered as soon as the inventory position (number of items on stock and on order) reaches a specified level s. At this point, an order is placed, for a number of items such that the new inventory position will be S. Making the following modifications to the rewards, by making certain actions infeasible, will ensure the system makes use of an (s, S) policy:

• When an order is placed, the number of items ordered must always bring the new inventory position exactly to the level S. Thus, actions which do not order this amount or 0, are always infeasible.

• Orders must be placed when the inventory level $s_0$ drops to a value of s or lower. Thus actions which order 0 in these situations will be set to infeasible. Note that an order is only placed the moment $s_0$ falls to a level of s or lower, which only happens when replacements are made.

• Orders cannot be placed when the inventory level s0 is greater than s, even after replacements. Therefore all actions which in this case order more than 0 items are set to infeasible.
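
One possible reading of these three rules is sketched below as a filter on actions. This is an interpretation, not the thesis implementation: it assumes that the stock is evaluated after the replacements of the current action, that the order quantity is the amount needed to bring the inventory position back to S, and that no order is placed when that amount is zero. The function and parameter names are hypothetical.

```python
def allowed_by_sS(inv, n_replaced, n_ordered, reorder_level, order_up_to):
    """Return True if ordering n_ordered spares is consistent with an (s, S) policy.

    inv = (s_0, ..., s_{L-1}); reorder_level is s and order_up_to is S.
    """
    stock_after = inv[0] - n_replaced            # stock left after replacements
    position_after = sum(inv) - n_replaced       # stock plus outstanding orders
    required = order_up_to - position_after      # quantity that restores position S
    if stock_after > reorder_level or required <= 0:
        return n_ordered == 0                    # no order allowed in this situation
    return n_ordered == required                 # an order of exactly this size is forced

# Example with s = 0 and S = 2: replacing the last spare forces an order of 2.
print(allowed_by_sS((1, 0, 0), 1, 2, reorder_level=0, order_up_to=2))   # True
print(allowed_by_sS((1, 0, 0), 1, 1, reorder_level=0, order_up_to=2))   # False
```

Actions rejected by such a filter would receive the same large negative reward as other infeasible actions, after which value iteration can be run unchanged.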

The second alternative implemented is the sequential optimization of maintenance and inventory decisions. This is done by first optimizing the maintenance decisions without taking spare parts into account (and thus assuming that a spare is always available). This can easily be implemented in the existing system by setting the holding and ordering costs of spare parts to 0, raising the maximum inventory position to a level that is sufficiently high, and setting the lead time L to the lowest possible level, 2. This way the system will ensure that enough spare parts are always present to perform possible replacements. The optimal policy in terms of maintenance is then generated first, and used to subsequently find a complementary ordering policy. We can simply force the system to make use of the given maintenance policy by making all deviating actions infeasible through setting their rewards to $-10^9$. Note that doing this may for some states have the result that no feasible actions are available. This happens when the maintenance action selected in the first step cannot be performed due to unavailability of sufficient spare parts, when inventory is also considered. To prevent this from happening, the system will only be forced to adhere to the maintenance policy from the first step in states where the action of that policy is feasible. In case the maintenance action determined in the first step is infeasible, it will also be allowed to do nothing. In practice that means that if the maintenance policy dictates that a component should be replaced but there is no spare part for this replacement, then the Do Nothing action will be selected until a spare part arrives, at which point the intended replacement can be performed.

5 Results

5.1 Computational Efficiency Analysis

In this subsection, the models using the flat state space representation, factored representation, and the unique deterioration states will be compared in terms of computational capabilities. We will look at the maximum input sizes in terms of the parameters M , N , and L that each formulation is able to solve, and the computation time required to find the optimal solution. The influence of individual input parameters on computation speed will also be assessed for the various formulations. Note that the value of the maximum inventory position P also affects the size of the problem, as both the number of states and the number of actions depend on P . P however is not an input parameter that follows directly from the problem, but is a level to be set by the user, at a level sufficiently high such that the optimal solution is not affected. This level is dependent on the values of the input parameters. An increase in for instance N may therefore not only directly result in an increase in the number of states and actions, but also indirectly, as it may require a higher level of P to be set.

All models were programmed using a combination of R and C++. All time-consuming computations, such as generating the sets of rewards and transition probabilities and running the value iteration algorithm, were performed using C++, as this was more time-efficient. Computations were done using a computer with 8GB of RAM and a 2.5GHz Intel Core i5 processor. For the first formulation, the MDPToolbox package in R was used to run the value iteration algorithm. This formulation explicitly requires the set of transition probabilities to be given as input in the form |A| × |S| × |S|. Since the other two formulations do not specify the transition probabilities in this form, an alternative solver had to be designed. This solver, implemented in C++, is an implementation of the value iteration algorithm that was specifically designed to solve the factored version of the joint maintenance and inventory decision problem. Though a specifically tailored solver has its inherent advantages, the downside is that it may not have been programmed as efficiently as possible, in particular compared to the professionally designed MDPToolbox solvers.

level of P that is required. The following costs will be used for the remainder of this subsection: $c_r = 5$, $c_i = 5$, $c_s = 2$, $c_d = 100$, $c_h = 0.5$, $c_o = 1$.

Note that $c_d$ is far higher than the other cost parameters, indicating that failures are very undesirable. Using these settings, it was found that a level of P = 2 is sufficiently high: increasing P to 3 did not cause any improvement in the optimal solution found. From the input parameters, the sets of possible states and actions can be generated, and subsequently the sets of transition probabilities and rewards. With the above settings, the flat state space formulation takes 0.18 seconds to generate the input sets S, A, P, and R, of which 0.17 seconds were used for generating the set of transition probabilities P. From this input, the value iteration solver from the MDPToolbox was able to find the optimal solution in 0.514 seconds. When increasing the number of components to 3, P = 2 still suffices. The failure rate is relatively low, so the chance of all components failing and thus requiring a spare part is so small that having a spare part on stock for each component is never profitable in the current setting. With the increased number of components, it now takes 9.85 seconds to generate the input. Of this, 9.83 seconds were required to generate the set of transition probabilities and 0.02 seconds to generate R. The time required to find the optimal solution was 2.14 seconds. Clearly the generation of the set of transition probabilities was most heavily influenced by the number of components, which makes sense because of the way this set is formulated here. The MDPToolbox solver still managed to make use of it rather efficiently, likely as a result of its implementation as a sparse matrix in our model. The set is indeed very sparse, caused by the large number of partially deterministic transitions, making the probability of deviating transitions equal to 0.

Further increasing the number of components to 4 requires the maximum inventory position P to be set to 3. With these input settings, an error occurs when generating the set P, as the maximum memory size of 8GB is reached during the generation process. Since the full matrix cannot be generated, the problem cannot be solved using the current formulation. The 4 component system can be solved using the flat state space representation when the lead time L is lowered to 2 periods. This also allows P = 2 to be used again without affecting the final solution. Together, this reduces the size of P enough that the memory limit is no longer exceeded. Generating the input sets took 165 seconds and the optimal solution was found in 23.4 seconds.
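To give a feeling for why the memory limit is reached, the sketch below estimates the storage needed for the transition probabilities in dense and in sparse form. The state and action counts are hypothetical round numbers, not the sizes of the model in this thesis, and the 16 bytes per stored non-zero entry (one 8-byte value plus two 4-byte indices) is likewise an assumption.

```python
def dense_bytes(n_states, n_actions, bytes_per_value=8):
    """Storage for a full |A| x |S| x |S| array of doubles."""
    return n_actions * n_states * n_states * bytes_per_value


def sparse_bytes(n_states, n_actions, successors_per_state, bytes_per_entry=16):
    """Storage when only the non-zero transitions are kept.

    successors_per_state is the assumed number of states that can be
    reached directly from any state under a given action.
    """
    return n_actions * n_states * successors_per_state * bytes_per_entry


GIB = 1024 ** 3

# Hypothetical problem size, chosen only to illustrate the scaling.
n_states, n_actions = 20_000, 50
dense_gib = dense_bytes(n_states, n_actions) / GIB
sparse_gib = sparse_bytes(n_states, n_actions, successors_per_state=30) / GIB
print(f"dense: {dense_gib:.1f} GiB, sparse: {sparse_gib:.2f} GiB")
```

The dense requirement grows with the square of the number of states, while the sparse requirement grows only linearly as long as the number of directly reachable states per state stays bounded.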


required for the generation of the set of transition probabilities P. The factored model formulation offers improvement in this aspect, as the transition probabilities are defined far more compactly. The size of P is reduced further when using the unique deterioration states formulation, as a result of requiring fewer deterioration states. A comparison between the three formulations for up to 7 components is given in Table 1, where the computation time of each formulation is given in terms of the time required to generate the necessary input and the time used by the value iteration algorithm. ML denotes that a memory limit was exceeded before the process was completed.

Table 1: Computation times in seconds, with L = 3 and M = 5.

      Flat State Space Formulation  Factored Formulation          Unique States Formulation
N     Input     Solution  Total     Input     Solution  Total     Input     Solution  Total
2     0.18      0.51      0.69      0.01      0.02      0.03      0.02      0.02      0.04
3     12.63     2.44      15.07     0.03      0.29      0.32      0.05      0.2       0.25
4     ML        -         -         1.23      107.04    108.27    0.5       3.14      3.64
5     ML        -         -         13.03     9694.57   9707.6    4.81      18.72     23.53
6     ML        -         -         ML        -         -         96.25     227.61    323.86
7     ML        -         -         ML        -         -         1887.7    1439.2    3326.9


When comparing the unique deterioration states formulation to the factored formulation¹, we see that initially, for a low number of components, the unique states formulation takes slightly longer to generate the input. Though the set of deterioration probabilities Pd has reduced in size, it consists of sums of elements from the matrix Pd of the factored formulation, and thus it will take approximately the same amount of time or slightly longer to generate. Furthermore, determining the intermediate state will take slightly longer, since after maintenance actions have been applied to the deterioration levels of individual components, these deterioration levels may no longer be given in ascending order. An additional sorting step is then needed to ensure that the intermediate deterioration states are in the same form as the original states. In terms of input generation, the unique states formulation has the benefit that there are fewer deterioration states for which intermediate states and rewards must be generated for every action. For a higher number of components, starting at N = 4, this benefit starts to outweigh the downsides of the unique states formulation, and we see that from this point onward it generates the input more quickly than the factored formulation. Moreover, though generating Pd will take as long as previously, the reduction in its size ensures that the memory limit is not reached as quickly; in fact the memory limit will no longer be an issue, and the only limiting factor for the unique states formulation is the computation time required.
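The reduction in deterioration states and the sorting step can be sketched as follows. This is an illustrative Python example, not the C++ implementation; it assumes N identical components with deterioration levels numbered 1 to M, and the function names are hypothetical.

```python
from itertools import combinations_with_replacement, product
from math import comb


def all_states(n_components, n_levels):
    """Every ordered tuple of deterioration levels (factored formulation)."""
    return list(product(range(1, n_levels + 1), repeat=n_components))


def unique_states(n_components, n_levels):
    """Only tuples in ascending order (unique states formulation)."""
    return list(combinations_with_replacement(range(1, n_levels + 1), n_components))


def canonical(state):
    """Sorting step: map any deterioration state to its ascending form."""
    return tuple(sorted(state))


for n in (2, 5, 7):
    full, unique = 5 ** n, comb(5 + n - 1, n)
    print(f"N = {n}: {full} ordered states, {unique} unique states")
```

Under these assumptions the number of unique deterioration states equals the number of multisets of size N drawn from M levels, so the saving relative to M^N ordered states grows quickly with N.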

The time it takes to find the optimal solution through the value iteration algorithm has also diminished greatly. Because of the smaller state space, the utilities of fewer states need to be calculated, and the computation of each utility uses a smaller number of elements and thus takes fewer basic operations. The reduction in deterioration states as a result of the unique states formulation becomes greater as the number of states increases. The gap in computational efficiency between the factored and unique states representation thus widens as the number of components increases.
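The widening gap can be read directly from the total times in Table 1. The short Python sketch below only divides the reported totals of the factored formulation by those of the unique states formulation; the values are copied from Table 1 and no new measurements are introduced.

```python
# Total computation times in seconds, copied from Table 1 (L = 3, M = 5).
factored_total = {2: 0.03, 3: 0.32, 4: 108.27, 5: 9707.6}
unique_total = {2: 0.04, 3: 0.25, 4: 3.64, 5: 23.53}

# Ratio above 1 means the unique states formulation is faster.
speedup = {n: factored_total[n] / unique_total[n] for n in factored_total}

for n, ratio in speedup.items():
    print(f"N = {n}: factored / unique = {ratio:.2f}")
```

The ratio is below 1 for N = 2 and rises with every additional component, in line with the discussion above.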

In Table 2, the computation times of the three formulations are given for lead times L ranging from 2 to 9. For this analysis a system of 3 components was used, and again the number of deterioration levels per component is 5.

Table 2: Computation times in seconds, with N = 3 and M = 5.

      Flat State Space Formulation  Factored Formulation          Unique States Formulation
L     Input     Solution  Total     Input     Solution  Total     Input     Solution  Total
2     2.94      1.5       4.44      0.02      0.14      0.16      0.03      0.1       0.13
3     12.63     2.44      15.07     0.03      0.29      0.32      0.04      0.2       0.24
4     168.11    6.08      174.19    0.08      1.1       1.18      0.06      0.81      0.87
5     523.87    11.1      534.97    0.14      1.82      1.96      0.09      1.31      1.4
6     1417.01   18.27     1435.28   0.22      2.43      2.65      0.11      2.02      2.13
7     28704.12  84.23     28788.35  0.31      4.86      5.17      0.13      3.93      4.06
8     TL        -         -         1.66      27.32     28.98     0.75      19.5      20.25
9     TL        -         -         2.75      43.09     45.84     1.44      33.74     35.18

Looking at the influence of an increasing lead time on the computation time required, we see some major differences compared to the effect of increasing the number of components. For all three formulations, an increase in the lead time affects computation time less than an increase in the number of components. Effects are particularly small for the factored and unique states representations, as the lead time only determines the set of inventory states, and changes in these states are deterministic. Slightly larger jumps in computation time for these two formulations can be seen when L is increased from 3 to 4 and from 4 to 5. This occurred because for these steps P had to be raised to 3 and to 4 respectively. Another difference is that no memory limit was exceeded; instead, a time limit (TL) of approximately 8 hours was exceeded for the flat state space formulation with a lead time of 8 or higher. The transition probabilities of this formulation are stored as lists of sparse matrices, with one sparse matrix for each action. As the lead time increases, the number of states increases, but the number of states that can be reached directly from any state does not. The fraction of 0 values in the matrix is thus higher than when the number of components is increased; the matrix becomes more sparse. All these zeroes must still be computed individually when using the flat state space formulation, so the computation time is not affected much by how sparse a matrix is. The memory required is affected, however, and therefore the time limit is exceeded before the memory limit in this case. Finally, we see that though the unique states formulation is still the fastest, the difference with the factored formulation is quite small. This is because the unique states formulation reduces the number of deterioration states, which are unaffected by the lead time. As the number of components used is only 3, the benefits of the unique states formulation over the factored formulation are limited.

¹ Note that the unique deterioration states formulation also uses a factorisation of the state space.


Table 3: Computation times in seconds, with N = 3 and L = 3.

      Flat State Space Formulation  Factored Formulation          Unique States Formulation
M     Input     Solution  Total     Input     Solution  Total     Input     Solution  Total
4     3.35      1.21      4.56      0.02      0.09      0.11      0.01      0.08      0.09
5     12.63     2.44      15.07     0.03      0.29      0.32      0.04      0.2       0.24
6     28.18     3.25      31.43     0.05      0.62      0.67      0.06      0.28      0.34
7     69.3      6.78      76.08     0.07      1.66      1.73      0.08      0.43      0.51
8     154.85    12.24     167.09    0.14      8.04      8.18      0.13      0.72      0.85
9     309.69    18.81     328.5     0.21      26.99     27.2      0.2       1.31      1.51
10    580.78    36.89     617.67    0.33      59.86     60.19     0.35      2.63      2.98
11    ML        -         -         0.44      128.38    128.82    0.45      3.88      4.33
12    ML        -         -         0.63      254.12    254.75    0.69      6.32      7.01


5.2 Solution Analysis

5.2.1 Optimal Policy

In the following, the optimal solution along with alternative solutions will be generated and compared to each other. First, a representation of the optimal policy for a system with 2 components and 5 deterioration levels per component is shown in Tables 4a-j. A system of only 2 components is used because policies for larger systems would be more difficult to illustrate graphically. For similar reasons, the policy for the factored representation is shown, as the symmetrical deterioration states of this form show a clearer structure in the optimal policy. Besides using the factored formulation to find this solution, it is also possible to use the more efficient unique states formulation and convert the policy for the unique states to a policy for the full set of all initial deterioration states. This way, the benefit of efficient solution generation of the unique states formulation can be combined with the more intelligible solutions of the factored formulation. Tables 4a-j show the complete optimal policy, with the cost parameters set as in the previous subsection (i.e. cr = ci = 5, cs = 2, cd = 100, ch = 0.5, and co = 1). These specific parameters were set with the following ideas in mind.

• Failures should be very undesirable.

• Imperfect Maintenance has the benefit of not requiring a spare part and not incurring setup costs. As such, replacements should be more cost efficient in improving a component’s deterioration level, so that both types of maintenance have their benefit and will be present in the optimal policy.

• Holding costs should be high enough such that it is not feasible to always keep a high number of spare parts on stock, but not so high that inventories are almost never used.

Together, these ensure that both types of maintenance and effective planning of inventory are able to contribute to the optimal solution. Each table gives, for one inventory state, the optimal action for every deterioration state. The inventory state is given for each table in the form (s0, s1, s2), where s0 denotes the number of spare parts on stock, and s1 and s2 the number of spare parts that will arrive in one or two periods respectively. The deterioration state consists of a deterioration level for components 1 and 2. The vertical axis of each table gives the deterioration level of component 1, while the horizontal axis gives the deterioration level of component 2. All actions consist of three parts: a maintenance action for component 1, a maintenance action for component 2, and the number of spare parts to be ordered. A maintenance action for a component can either be Do Nothing (D), Imperfect Maintenance (I), or Replace (R).
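The conversion from a policy on unique states to a policy on all deterioration states, mentioned above, might look as follows. This Python sketch is an assumption about how such a lookup could be organised, not the code used for the thesis; the dictionary layout and the name expand_policy are hypothetical. The single policy entry is taken from Table 4a, where state (1, 3) has action D I 1.

```python
def expand_policy(unique_policy, state):
    """Look up the action for an arbitrary deterioration state.

    unique_policy maps an ascending deterioration tuple to a pair
    (maintenance actions in sorted order, number of spare parts ordered).
    The actions are permuted back to the original component order.
    """
    # Component indices ordered by deterioration level (stable for ties).
    order = sorted(range(len(state)), key=lambda i: state[i])
    sorted_state = tuple(state[i] for i in order)
    actions_sorted, n_ordered = unique_policy[sorted_state]
    actions = [None] * len(state)
    for position, component in enumerate(order):
        actions[component] = actions_sorted[position]
    return tuple(actions), n_ordered


# One entry of a unique-states policy for inventory state (0,0,0):
# levels (1, 3) -> do nothing, imperfect maintenance, order 1 spare part.
unique_policy = {(1, 3): (("D", "I"), 1)}

print(expand_policy(unique_policy, (1, 3)))
print(expand_policy(unique_policy, (3, 1)))
```

For the mirrored state (3, 1) the lookup assigns imperfect maintenance to component 1 instead, which agrees with the entry I D 1 in Table 4a.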


Table 4: Optimal Policy, with for each state a maintenance action for both components and the number of spare parts to be ordered. Rows give Det. Level 1 (component 1), columns give Det. Level 2 (component 2).

Table 4a: (s0,s1,s2) = (0,0,0)
     1      2      3      4      5
1    D D 0  D D 1  D I 1  D I 1  D D 2
2    D D 1  D D 2  D I 2  D I 2  D D 2
3    I D 1  I D 2  I I 2  I I 2  I D 2
4    I D 1  I D 2  I I 2  I I 2  I D 2
5    D D 2  D D 2  D I 2  D I 2  D D 2

Table 4b: (s0,s1,s2) = (1,0,0)
     1      2      3      4      5
1    D D 0  D D 0  D R 0  D R 0  D R 0
2    D D 0  D D 0  D R 1  D R 1  D R 1
3    R D 0  R D 1  I R 1  I R 1  I R 1
4    R D 0  R D 1  R I 1  I R 1  I R 1
5    R D 0  R D 1  R I 1  R I 1  R D 2

Table 4c: (s0,s1,s2) = (2,0,0)
     1      2      3      4      5
1    D D 0  D D 0  D R 0  D R 0  D R 0
2    D D 0  D D 0  D R 0  D R 0  D R 0
3    R D 0  R D 0  R R 0  R R 0  R R 0
4    R D 0  R D 0  R R 0  R R 0  R R 0
5    R D 0  R D 0  R R 0  R R 0  R R 0

Table 4d: (s0,s1,s2) = (0,1,0)
     1      2      3      4      5
1    D D 0  D D 0  D D 0  D I 0  D D 0
2    D D 0  D D 0  D D 1  D I 1  D D 1
3    D D 0  D D 1  I D 1  I I 1  I D 1
4    I D 0  I D 1  I I 1  I I 1  I D 1
5    D D 0  D D 1  D I 1  D I 1  D D 1

Table 4e: (s0,s1,s2) = (1,1,0)
     1      2      3      4      5
1    D D 0  D D 0  D R 0  D R 0  D R 0
2    D D 0  D D 0  D R 0  D R 0  D R 0
3    R D 0  R D 0  D D 0  D R 0  D R 0
4    R D 0  R D 0  R D 0  I R 0  I R 0
5    R D 0  R D 0  R D 0  R I 0  R D 0

Table 4f: (s0,s1,s2) = (0,2,0)
     1      2      3      4      5
1    D D 0  D D 0  D D 0  D I 0  D D 0
2    D D 0  D D 0  D D 0  D I 0  D D 0
3    D D 0  D D 0  D D 0  D I 0  D D 0
4    I D 0  I D 0  I D 0  I I 0  I D 0
5    D D 0  D D 0  D D 0  D I 0  D D 0

Table 4g: (s0,s1,s2) = (0,0,1)
     1      2      3      4      5
1    D D 0  D D 0  D I 0  D I 0  D D 0
2    D D 0  D D 0  D I 0  D I 1  I D 0
3    I D 0  I D 0  I I 0  I I 1  I D 1
4    I D 0  I D 1  I I 1  I I 1  I D 1
5    D D 0  D I 0  D I 1  D I 1  D D 1

Table 4h: (s0,s1,s2) = (1,0,1)
     1      2      3      4      5
1    D D 0  D D 0  D R 0  D R 0  D R 0
2    D D 0  D D 0  D R 0  D R 0  D R 0
3    R D 0  R D 0  I R 0  I R 0  I R 0
4    R D 0  R D 0  R I 0  I R 0  I R 0
5    R D 0  R D 0  R I 0  R I 0  R D 0

Table 4i: (s0,s1,s2) = (0,1,1)
     1      2      3      4      5
1    D D 0  D D 0  D D 0  D I 0  D D 0
2    D D 0  D D 0  D D 0  D I 0  D D 0
3    D D 0  D D 0  D D 0  D I 0  I D 0
4    I D 0  I D 0  I D 0  I I 0  I D 0
5    D D 0  D D 0  D I 0  D I 0  D D 0

Table 4j: (s0,s1,s2) = (0,0,2)
     1      2      3      4      5
1    D D 0  D D 0  D I 0  D I 0  D D 0
2    D D 0  D D 0  D I 0  D I 0  D D 0
3    I D 0  I D 0  I I 0  I I 0  I D 0
4    I D 0  I D 0  I I 0  I I 0  I D 0
5    D D 0  D D 0  D I 0  D I 0  D D 0


deterioration level of 3 or higher. Imperfect maintenance is thus used to reduce the risk of a failure when no spare part is available for a replacement. We can confirm this usage of imperfect maintenance by viewing the optimal policy for the system where imperfect maintenance cannot be used. This policy is shown in Table 5 in Appendix A using the same layout as Table 4. We see that replacements are not performed at earlier deterioration levels, but changes do arise in the ordering of spare parts. Without imperfect maintenance, spare parts are ordered sooner, i.e. at lower deterioration levels, to reduce the risk of a failure occurring when no spare parts are available. Table 4 shows that when imperfect maintenance can be performed, it is used to reduce the risk of failure while waiting for a spare part to arrive, as we see that when imperfect maintenance is performed, a spare part is either ordered or is already on order. When the spare part arrives, a replacement action is selected over imperfect maintenance for components with deterioration levels of 3 or higher. Replacements are more cost efficient than imperfect repairs, as they cost the same but have a bigger impact on the deterioration level, and are thus preferred when a spare part is available. When a spare part is available, replacements also have the benefit that they reduce holding costs incurred in future periods by removing a spare part from stock. Having an empty stock can indeed be desirable, as Table 4a shows that when both components are in the good-as-new state, it is optimal to not order any spare parts and keep the number of spare parts on stock and on order at 0. This only holds true when imperfect maintenance can be performed, as Table 5a shows that a spare part is ordered even when both components are in the good-as-new state.

If, for the system with imperfect maintenance, a single component deteriorates beyond the good-as-new level, a spare part is ordered, and if it deteriorates further, imperfect maintenance is performed to reduce the risk of failure before this spare part arrives. If both components deteriorate from the good-as-new state, however, 2 spare parts are immediately ordered. We can conclude this is done to save ordering costs, as in Table 4b we see that when both components are at deterioration level 2, it is not yet necessary to order an additional spare part, and the single spare part in stock suffices for now.

5.2.2 Solution Costs

The optimal costs per period related to the given policy for the 2 component system are given in Figure 1, along with the costs of deviating strategies, and the optimal costs for the system without imperfect maintenance. The joint optimal (s, S) policy was found by testing potential candidates for the safety stock s and order-up-to level S, finding the optimal cost for each possible combination, and subsequently selecting the (s, S) policy that had the lowest costs related to it. The best jointly optimised (s, S) policy was found to use a safety stock s = 0 and order-up-to level S = 1, while the optimal (s, S) policy for the sequentially optimised system was found for s = 1 and S = 2.
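The search over (s, S) candidates described above can be sketched as a small grid search. This Python example is a sketch under stated assumptions: it assumes an order is placed whenever the inventory position is at or below s, and the cost function toy_cost is a placeholder with made-up coefficients. In the thesis each candidate is instead evaluated by optimising the maintenance decisions around the fixed ordering rule.

```python
def order_quantity(inventory_position, s, S):
    """(s, S) rule: order up to S once the position is at or below s (assumed)."""
    return S - inventory_position if inventory_position <= s else 0


def best_s_S(evaluate, max_position):
    """Return the (s, S) pair with the lowest cost according to evaluate(s, S).

    All pairs with 0 <= s < S <= max_position are tried.
    """
    candidates = [(s, S) for S in range(1, max_position + 1) for s in range(S)]
    return min(candidates, key=lambda pair: evaluate(*pair))


def toy_cost(s, S):
    # Placeholder cost per period; NOT a result from the thesis.
    return 2.0 + 0.3 * s + 0.2 * abs(S - 1)


print(best_s_S(toy_cost, max_position=3))
print(order_quantity(0, s=0, S=1))
```

With a real evaluation function in place of toy_cost, the same loop would compare the long-run cost per period of each candidate policy.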


Figure 1: Cost decomposition of six different strategies.

(s, S) ordering policy, costs increased only marginally. First of all, these low costs result from the fact that while a sub-optimal ordering policy is used, the rest of the policy is fully optimised around this policy. Furthermore, the possibility of performing imperfect maintenance provides additional flexibility to the maintenance decisions, as this can be performed to prevent potential failures even when no spare parts are present. This is shown by the optimal costs for the system where IM cannot be performed. Costs increase significantly due to the loss of IM, mainly through increased holding costs and replacement costs. The optimal policy in Table 5a indeed shows that spare parts are ordered at lower deterioration levels, explaining the increase in holding costs. Replacements are performed at the same states as in the system with IM, but without IM these states are reached sooner, causing replacements to be performed more frequently. The effect of the loss of IM is even greater for the optimal (s, S) policy, whose costs increase to 2.25, as IM is no longer available to reduce the costs resulting from a sub-optimal ordering strategy. The optimal order-up-to level has as a result increased from 1 to 2, so more spare parts are present on average to cope with the loss of imperfect maintenance as a means to reduce the risk of failure, but this does greatly increase holding costs. Ordering costs are reduced however, as spare parts are in this case always ordered 2 at a time.


5.3 Sensitivity Analysis

5.3.1 Influence of the Cost of Imperfect Maintenance

In the optimal costs shown in Figure 1, the mean costs per period are 0.82 for replacements and 0.26 for imperfect maintenance. Since the cost of a replacement is the same as for imperfect maintenance, it follows that components are replaced roughly three times as often as they are imperfectly repaired. We will now vary the cost of imperfect maintenance, to analyse at what point both maintenance actions are performed at the same frequency, and when maintenance in the optimal policy consists of only one of the two actions. Note that imperfect maintenance will never be performed when its price is high enough, but even when imperfect maintenance is very cheap, replacements will still sometimes be necessary, as some chance of a component failing always exists. All other parameters remain constant and use their previously stated values. Figure 2 shows the replacement costs, IM costs, and total cost of the optimal policy for several levels of ci.
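The frequency comparison follows from dividing each mean cost per period by the cost of a single action. The Python sketch below works through this arithmetic with the values quoted above; it assumes that the reported replacement costs exclude setup costs, which are listed as a separate category.

```python
# Mean costs per period quoted for Figure 1 and the unit costs cr = ci = 5.
replacement_cost_rate = 0.82
im_cost_rate = 0.26
c_r = c_i = 5.0

# Expected number of actions per period = cost per period / cost per action.
replacements_per_period = replacement_cost_rate / c_r
im_per_period = im_cost_rate / c_i

ratio = replacements_per_period / im_per_period
print(f"replacements per period: {replacements_per_period:.3f}")
print(f"imperfect repairs per period: {im_per_period:.3f}")
print(f"ratio: {ratio:.2f}")
```

Because the two unit costs are equal, the ratio of frequencies is simply the ratio of the two cost components.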

Figure 2: Costs for various levels of cost for Imperfect Maintenance


this changes to 0.82 per period for replacements and 0.25 for IM. This change in which maintenance type is most dominantly used might become smoother if more deterioration levels are used per component. When the number of replacements is low, few spare parts are required. So when mostly IM is performed, not only the replacement costs per period, but also the costs for ordering, holding and setups will be low. Therefore the IM costs are close to the total costs in these cases. Finally, even when IM is relatively expensive, it can still cause a reduction in total costs by occasionally being used to prevent failures. At ci = 8, IM is rarely used, but total costs are 2.07, compared to 2.11 when IM is never used. Though IM is at this point relatively expensive, it is helpful in reducing the risk of even more expensive downtime costs being incurred. Even when ci = 27, so more than 5 times as expensive as replacing a component, IM still provides some benefit, as total costs are 2.10 at this point. Above this level, performing IM is no longer advantageous.

Figure 3: Total costs of 4 strategies for various levels of cost for Imperfect Maintenance


to high holding costs. Not having this spare part on stock however is not a possibility, as in this case a spare part would never be available or ordered, even when a component is in the failed state. As such the component would never be able to leave the failed state. A spare part is thus always kept on stock to prepare for the rare occasion of a failure. The optimal policy shows however that in this case it would be better to reduce the risk of failures as much as possible through IM, and only order a spare part in the unlikely event that a failure still occurs.

As explained earlier, the sequentially optimised policies typically do not make use of IM. Therefore the costs are constant when ci is 4 or higher. For lower costs of IM, the sequentially optimised policies start making use of it. For the optimal sequential policy, costs decrease quickly at this point, as the policy makes effective use of the cheaply available IM. This does not hold for the sequential (s, S) policy however. The maintenance policy switches from using only replacements to using mostly IM. Spare parts are still kept on stock however, in case replacements must be performed. Using the same inventory policy but with fewer replacements greatly increases holding costs, which is why this policy shows an increase in costs as ci is lowered from 4 to 3.5. For even lower levels of ci the costs do get lower, as the benefit of cheap IM starts to outweigh the increased holding costs.


5.3.2 Influence of the Number of Deterioration Levels

Next, the number of deterioration levels M has been increased to assess whether this makes the transition between using mostly replacements and mostly imperfect maintenance smoother. Increasing the number of states without altering other parameters, however, diminishes the effect of IM. In the previous analysis, IM improved the deterioration level of a component by one level, and a replacement by at most three for a preventive replacement. To ensure that the same ratio is kept, M has been increased to 11 while δ has been increased to 3. Furthermore, increasing M without altering λ inadvertently increases the expected lifetime of components. As this would not allow for a fair comparison of costs, the level of λ was varied with M such that the expected lifetime always remained at 20. The resulting costs are shown in Figure 4.

Figure 4: Costs for various levels of ci with M = 11 and δ = 3.


more detailed information on the deterioration levels of components, ensuring a timely arrival of spare parts for replacements becomes easier. IM still remains an effective measure to reduce costs however, as similarly to the previous system with M = 5, IM is being used in the optimal policy even for high levels of ci. Looking at the costs per period for the optimal policy, we indeed see in Figure 5 that total costs decrease as M increases, though the benefits do diminish. For high values of M, costs for replacements constitute more than two thirds of total costs. Replacements will still have to be performed at a similar rate since the lifetime of components is unaffected, but the more accurate information regarding the deterioration state allows for them to be planned more efficiently. We see that setup costs have reduced by more than replacement costs, thus replacements are clustered more frequently. Holding costs also decrease rapidly as M increases, so spare parts are often used immediately upon arrival. Downtime costs have also reduced greatly, even though maintenance is performed less frequently. Imperfect maintenance costs still make up a significant part of total costs, so while having more accurate information on the system state can be very advantageous, deterioration still occurs randomly, and imperfect maintenance remains beneficial as it reduces the risk of failures.


The influence of changes in M on the alternative policies and on the optimal policy without IM, shown in Figure 6, was similar: all policies benefited from having more detailed information on the deterioration state. Cost reductions in the optimal sequential policy and the optimal policy without IM were very close to the reductions in the optimal policy. For the (s, S) and sequential (s, S) policies however, costs decreased more slowly. This is because for these policies only the maintenance decisions benefit from the improved information on the deterioration levels, while the inventory decisions are unaffected.


5.3.3 Influence of the Lead Time

When the lead time increases, it becomes more important to adequately plan ahead so that spare parts are available when replacements are required. Decompositions of costs for various levels of L are given in Figure 7, while a comparison between the optimal policy, the optimal policy without IM, and three alternative policies is shown in Figure 8. A 2 component system with 8 deterioration levels and δ = 2 was used. This change was made as a higher number of deterioration states was able to more clearly illustrate the influence of L. With M = 5, changes in L more frequently did not affect the policies. All other parameters were kept at the levels used initially.

Figure 7: Costs of optimal policy for several levels of L.


performing replacements later. For higher levels of L however, doing this would increase holding and downtime costs by too much, and therefore more replacements (and more IM) are performed for lead times of 6 and 7. The increase in IM helps by ensuring more clustering can be done without increasing the risk of failures, as the downtime is reduced again for higher levels of L. Figure 8 now shows the same total costs for the optimal policy, along with the total costs per period of the optimal policy without IM and the three alternative policies.

Figure 8: Costs of five different policies for several levels of L.


5.3.4 Influence of the Number of Components

Finally, the number of components N will be varied to assess how this affects costs. A decomposition of the optimal costs for several levels of N is given in Figure 9. To make a fair comparison, the costs per period per component are shown instead of the total costs per period. A level of M = 5 was again used to save computation time for systems with many components. All other parameter settings were kept as in the previous analyses.

Figure 9: Costs of optimal policy for several levels of N.

We see that the cost per component decreases as the number of components increases. Part of this is because ordering spare parts and using them for replacements can more often be clustered, reducing the ordering and setup costs per component. Furthermore, larger systems benefit more from the shared pool of spare parts. A larger inventory can be kept for the same holding costs per component. As a result, a spare part is more frequently available to be used for a replacement. This reduces costs for imperfect maintenance, as IM was mostly used when no spare parts were available. We thus see an increase in the number of replacements performed and a decrease in IM as the number of components increases. The holding costs tend to decrease with N as well, but sometimes an increase occurs as higher inventory levels are required for larger systems. This causes the decrease in holding costs in small steps for increasing N, with occasionally an increase in a larger step.


Furthermore, the (s, S) policy can only order a spare part when replacements are performed. With more components, it happens more frequently that one or more components are replaced, providing more possibilities for ordering spare parts. Note however that for different values of M or ci the difference with the optimal policy in terms of costs will be bigger, as was shown previously. The sequentially optimised policies also acquired similar savings in costs per component per period as more components were added. Though the frequency of performing IM per component diminished with N in the optimal policy, the costs of the optimal policy without IM did not move closer to those of the optimal policy. Clustering maintenance and ordering becomes more important in larger systems, and IM is able to ensure maintenance can be clustered effectively, without replacing parts too soon or increasing the risk of failures.


6 Conclusion

In this thesis, a joint maintenance and inventory decision problem has been formulated as a Markov Decision Process to subsequently find the optimal solution using the value iteration algorithm. Whereas only a single form of maintenance, namely a replacement, is usually considered in these types of models, the additional possibility to perform imperfect maintenance was also included in this thesis. This increased the flexibility and robustness of solutions, specifically since the imperfect maintenance action does not require a spare part, while the replacement does. Results showed that imperfect maintenance was frequently used in the optimal policy, with the primary purpose of reducing the risk of failures when no spare part was present. As such, a timely preventive replacement could still be performed at a later stage. The ability to perform imperfect maintenance not only lowered total costs by reducing the risk of failure and thus downtime costs, but caused reductions in all other cost categories as well. Holding costs were reduced as lower inventories were required, replacements could be performed less frequently, and replacements and spare parts orders could more frequently be clustered, reducing both setup and ordering costs.

Alongside the optimal policy, several alternative policies were generated for comparison. The resulting costs from these policies showed that a full jointly optimised maintenance and inventory policy offered significant benefits over using an (s, S) policy or a sequentially optimised maintenance and inventory policy. The sequentially optimised policies especially performed worse, caused mostly by the fact that they did not make use of imperfect maintenance. This again shows that while imperfect maintenance may not be the most cost-efficient maintenance action, it provides its benefits through not requiring a spare part. A joint optimisation of maintenance and inventory is thus required to make full use of the potential of imperfect maintenance.

In addition to imperfect maintenance and its impact on the optimal solution, the second aspect of focus in this thesis was computational efficiency. Results clearly indicated that making effective use of a system’s structure could greatly reduce the computation time required and allow for larger systems to be solved. First of all, formulating the model as a factored Markov Decision Process instead of using the traditional flat state space formulation greatly improved computational efficiency. This formulation was then further improved by making use of the fact that when components are identical, duplicate states exist for multi-component systems. Removal of these duplicate states resulted in the unique states formulation, which reduced computation time even further and allowed for systems of up to 8 components to be solved.


References

Bengtsson, M. (2007). On condition based maintenance and its implementation in industrial settings. Doctoral Thesis, Mälardalens Högskola University, Sweden.

Camci, F. (2009). System maintenance scheduling with prognostics information using genetic algorithm. IEEE Transactions on reliability 58 (3), 539–552.

Dekker, R., R.E. Wildeman, and F.A. Van der Duyn Schouten (1997). A review of multi-component maintenance models with economic dependence. Mathematical Methods of Operations Research 45 (3), 411–435.

Dhillon, Balbir S (2002). Engineering maintenance: a modern approach. CRC Press.

Hong, H.P., W. Zhou, S. Zhang, and W. Ye (2014). Optimal condition-based maintenance decisions for systems with dependent stochastic degradation of components. Reliability Engineering & System Safety 121, 276–288.

Kolobov, A. (2012). Planning with Markov decision processes: An AI perspective. Synthesis Lectures on Artificial Intelligence and Machine Learning 6 (1), 1–210.

Nicolai, R.P. and R. Dekker (2008). Optimal maintenance of multi-component systems: a review. In Complex system maintenance handbook, pp. 263–286. Springer.

Olde Keizer, M.C.A. and R.H. Teunter (2014). Opportunistic Condition-based Maintenance and Aperiodic Inspections for a Two-unit Series System. Uni-versity of Groningen, Faculty of Economics and Business.

Olde Keizer, M.C.A., R.H. Teunter, and J. Veldman (2016). Joint condition-based maintenance and inventory optimization for systems with multiple com-ponents. European Journal of Operational Research.

Pham, H. and H. Wang (1996). Imperfect maintenance. European journal of operational research 94 (3), 425–438.

Prajapati, A., J. Bechtel, and S. Ganesan (2012). Condition based maintenance: a survey. Journal of Quality in Maintenance Engineering 18 (4), 384–400.

Spieksma, F. and R. N´unez-Queija (2015). Markov decision processes. Lecture Notes LNMB .

Stanley, R.P. (1986). What is enumerative combinatorics? In Enumerative Combinatorics, pp. 1–63. Springer.

7 Appendix A

Table 5: Optimal Policy for system without IM, with for each state a maintenance action for both components and the number of spare parts to be ordered. Rows are indexed by Det. Level 1 and columns by the deterioration level of the other component; each cell lists the action for component 1, the action for component 2, and the order quantity.

Table 5a: (s0,s1,s2) = (0,0,0)

Det. Level 1    1      2      3      4      5
1               D D 1  D D 2  D D 2  D D 2  D D 2
2               D D 2  D D 2  D D 2  D D 2  D D 2
3               D D 2  D D 2  D D 2  D D 2  D D 2
4               D D 2  D D 2  D D 2  D D 2  D D 2
5               D D 2  D D 2  D D 2  D D 2  D D 2

Table 5b: (s0,s1,s2) = (1,0,0)

Det. Level 1    1      2      3      4      5
1               D D 0  D D 0  D R 1  D R 1  D R 1
2               D D 0  D D 1  D R 2  D R 2  D R 2
3               R D 1  R D 2  D D 1  D R 2  D R 2
4               R D 1  R D 2  R D 2  D D 1  D R 2
5               R D 1  R D 2  R D 2  R D 2  R D 2

Table 5c: (s0,s1,s2) = (2,0,0)

Det. Level 1    1      2      3      4      5
1               D D 0  D D 0  D R 0  D R 0  D R 0
2               D D 0  D D 0  D R 0  D R 0  D R 0
3               R D 0  R D 0  R R 1  R R 1  R R 1
4               R D 0  R D 0  R R 1  R R 1  R R 1
5               R D 0  R D 0  R R 1  R R 1  R R 1

Table 5d: (s0,s1,s2) = (0,1,0)

Det. Level 1    1      2      3      4      5
1               D D 0  D D 0  D D 1  D D 1  D D 1
2               D D 0  D D 1  D D 1  D D 1  D D 1
3               D D 1  D D 1  D D 1  D D 1  D D 1
4               D D 1  D D 1  D D 1  D D 1  D D 1
5               D D 1  D D 1  D D 1  D D 1  D D 1

Table 5e: (s0,s1,s2) = (1,1,0)

Det. Level 1    1      2      3      4      5
1               D D 0  D D 0  D R 0  D R 0  D R 0
2               D D 0  D D 0  D R 0  D R 0  D R 0
3               R D 0  R D 0  D D 0  D R 1  D R 1
4               R D 0  R D 0  R D 1  R D 1  D R 1
5               R D 0  R D 0  R D 1  R D 1  R D 1

Table 5f: (s0,s1,s2) = (0,2,0)

Det. Level 1    1      2      3      4      5
1               D D 0  D D 0  D D 0  D D 0  D D 0
2               D D 0  D D 0  D D 0  D D 0  D D 0
3               D D 0  D D 0  D D 0  D D 0  D D 0
4               D D 0  D D 0  D D 0  D D 0  D D 0
5               D D 0  D D 0  D D 0  D D 0  D D 0

Table 5g: (s0,s1,s2) = (0,0,1)

Det. Level 1    1      2      3      4      5
1               D D 0  D D 0  D D 1  D D 1  D D 1
2               D D 0  D D 1  D D 1  D D 1  D D 1
3               D D 1  D D 1  D D 1  D D 1  D D 1
4               D D 1  D D 1  D D 1  D D 1  D D 1
5               D D 1  D D 1  D D 1  D D 1  D D 1

Table 5h: (s0,s1,s2) = (1,0,1)

Det. Level 1    1      2      3      4      5
1               D D 0  D D 0  D R 0  D R 0  D R 0
2               D D 0  D D 0  D R 0  D R 0  D R 0
3               R D 0  R D 0  D D 0  D R 1  D R 1
4               R D 0  R D 0  R D 1  D R 1  D R 1
5               R D 0  R D 0  R D 1  R D 1  R D 1

Table 5i: (s0,s1,s2) = (0,1,1)

Det. Level 1    1      2      3      4      5
1               D D 0  D D 0  D D 0  D D 0  D D 0
2               D D 0  D D 0  D D 0  D D 0  D D 0
3               D D 0  D D 0  D D 0  D D 0  D D 0
4               D D 0  D D 0  D D 0  D D 0  D D 0
5               D D 0  D D 0  D D 0  D D 0  D D 0

Table 5j: (s0,s1,s2) = (0,0,2)

Det. Level 1    1      2      3      4      5
1               D D 0  D D 0  D D 0  D D 0  D D 0
2               D D 0  D D 0  D D 0  D D 0  D D 0
3               D D 0  D D 0  D D 0  D D 0  D D 0
4               D D 0  D D 0  D D 0  D D 0  D D 0
5               D D 0  D D 0  D D 0  D D 0  D D 0
