Joint Optimization of Production and Maintenance Decisions for a Two-Machine Production Line


Gabe-Hein Dijkstra


Master’s Thesis Econometrics, Operations Research and Actuarial Studies Track: Operations Research

Supervisor: Dr. B. de Jonge


Abstract

In most industrial settings, efficient planning of production and maintenance is essential. This paper considers a production line consisting of two machines, an intermediate buffer, and an inventory with limited capacity. The goal is to minimize long-run average cost while satisfying stochastic demand. Condition-based production and maintenance decisions are jointly optimized. The problem is formulated as a Markov decision process. Both deterministic and stochastic maintenance durations are considered. The value iteration algorithm is used to obtain optimal decisions. Furthermore, we prove that maintenance is carried out according to a control-limit policy. A numerical analysis shows that the optimal strategy for each machine depends on the deterioration states of both machines and the situation at the buffer and inventory. Moreover, the numerical analysis provides insights into the sensitivity of the optimal policy.

1 Introduction

Machines are subject to unpredictable failures, after which costly and time-consuming corrective maintenance needs to be performed. Whenever a machine is under repair, it is not available to process any items. Hence, the fact that machines deteriorate randomly causes a high variability in machine availability. This negatively impacts the efficiency of the machine (Hopp and Spearman, 2008). Preventive maintenance is used to reduce the variability in the machine's availability. However, performing preventive maintenance too frequently leads to low availability and high maintenance costs as well. This trade-off should be kept in mind when formulating a maintenance policy.


is only performed when needed. Next to production plants, CBM policies can be applied to aerospace components, power plants, offshore installations, and many more industries.

Furthermore, when optimizing maintenance scheduling, it is crucial to keep the bigger picture in mind. It is also important to create an efficient production schedule. Understocking may lead to missed sales, while overstocking is accompanied by high inventory costs. This problem becomes especially complex to solve in systems with items that are demanded sporadically. Examples are spare parts for airplanes and large construction equipment. It is estimated that worldwide approximately $1.1 trillion is lost due to inventory distortions (Buzek, 2015).

The decisions regarding production and maintenance are interconnected. Performing preventive maintenance reduces production capacity, as the machine cannot process any items while it is being maintained. On the other hand, it helps prevent failures that would cost even more production capacity. The relation also works in the opposite direction: production causes equipment to deteriorate, which creates the need for maintenance. Optimizing production and maintenance decisions separately may therefore not yield satisfactory results. In current research, the joint optimization of production and maintenance is underexposed. Production and maintenance planning are typically treated separately. In practice, many organizations face these kinds of multi-faceted problems. Again, the supplier of large construction equipment is a suitable example.

In many industrial settings, a production line contains several machines connected in series. Usually buffers are installed between the machines. When considering multiple machines in series, the production rate of the entire system depends on the states of all machines. If a machine is down, items are not able to get processed by this machine and continue through the production line. In addition, a machine cannot process items when the upstream buffer is empty or the downstream buffer is full. This, on the other hand, may provide a maintenance opportunity. Due to these dependencies between machines, finding optimal production and maintenance strategies for multi-machine systems is more complicated than for single-machine systems. When deciding the optimal strategy for a machine, the condition of the other machines and the buffer levels need to be taken into account. Furthermore, the large number of possible policies increases the difficulty of determining the optimal one.


unsuccessful and has to be redone. Regarding maintenance, we consider both deterministic and stochastic durations. Two models will be formulated. In the first model, the durations of preventive and corrective maintenance are considered deterministic. These durations are stochastic in the second model. The purpose of these models is to analyze structural properties and obtain a policy that jointly optimizes production and maintenance considering a cost perspective and an infinite time horizon.

The remainder of the paper is structured as follows. First, in Section 2, the existing literature on this topic is reviewed and discussed. Thereafter, in Section 3, the specifications of the problem are formulated. We formulate the problem as a Markov decision process. In Section 4, we assume maintenance durations to be deterministic. In Section 5, we consider stochastic maintenance durations. In both of these sections the corresponding value iteration algorithm is specified, which can be used to simultaneously optimize the production and maintenance decisions. Section 6 contains structural properties of the optimal policy. Numerical experiments are carried out in Section 7. In Section 8, we present the final conclusions and remarks.

2 Literature Review

We review literature on the joint optimization of production and maintenance decisions. We first discuss the literature on single-machine systems. This can be subdivided into multiple categories. First, we discuss literature that does not consider demand as a limiting factor. Second, we review studies which consider known demand. Third, we examine literature which considers random demand. Finally, the current research on multi-machine production lines is reviewed. An overview of the current progress in the field of maintenance modeling and optimization is given by De Jonge and Scarf (2020).


a shorter processing time. As the machine deteriorates the yield for each product decreases. Among other things, they provide critical ratios for the firm’s manufacturing decision at each state. Cassady and Kutanoglu (2005) consider a machine that is minimally repaired upon failure. They apply an age-based preventive maintenance policy. The objective is to minimize the total weighted completion time by choosing an optimal job sequence. Yulan et al. (2008) simultaneously consider five objectives when determining the job sequence including minimizing maintenance costs, total weighted completion time of jobs, total weighted tardiness, makespan and maximizing machine availability. There is an option to perform preventive maintenance before every job.

The following studies also consider a single machine, but assume there is deterministic demand that needs to be satisfied. Najid et al. (2011) consider a multi-item capacitated lot-sizing problem with setup costs. Furthermore, they allow for demand shortages. Preventive maintenance is done periodically and a minimal repair is performed upon failure. The objective is to minimize the expected production and maintenance costs. Zhao and Wang (2014) consider a similar setting, but assume operation-dependent failures. Besides that, they assume that minimal repair requires time and allow for non-periodic preventive maintenance schedules. Kazaz and Sloan (2013) consider demand constraints, requiring a minimum and maximum amount of items to be produced of every product type. In addition, based on the machine condition, different types of maintenance are used which differ in cost incurred, expected down time, and impact on the condition of the machine. Sheu et al. (2015) compare multiple corrective maintenance options in a model with periodic preventive maintenance. They include the possibility for preventive maintenance to be imperfect and consider a reliability measure that indicates the ability of the system to meet the customer demand. Assid et al. (2015) consider multiple products with different constant demand rates and assume switching between producing different types of products requires a setup time. The preventive maintenance policy is based on the inventory position. Jafari and Makis (2015) also consider a constant demand rate. They assume products cannot be backlogged and introduce a cost parameter to account for lost orders. Furthermore, they assume that inspections are needed to observe condition information. Preventive maintenance is carried out when the failure rate or the age of the machine exceeds a certain threshold.


model. In their model, the mean processing time increases in the system state. Borrero and Akhavan-Tabatabaei (2015) formulate two Markov decision process models. The objective for the first model is based on the cost of performing maintenance and the cost of holding inventory. In the second model the holding cost is replaced by a cost for not satisfying demand. Demand arrives according to a Poisson process. Sloan (2004) formulates a Markov decision process of a single machine that can produce multiple products per period in order to satisfy random demand. The product yield is binomially distributed and depends on the deterioration state of the machine. Mifdal et al. (2013) consider various product types and divide the planning horizon into multiple periods. For each product type there exists a corresponding random demand that needs to be satisfied in every period. Preventive maintenance is performed periodically. Hajej et al. (2011) consider a similar setting, but with only one product type. The failure rate of the machine increases in both machine age and the production rate of the machine. De Jonge (2020) considers a production facility that processes jobs with different lengths. The problem is modeled as a Markov decision process. The interarrival times of jobs follow a geometric distribution. Jobs have to be processed without interruptions and are therefore non-resumable. Every time period a job is in the system a cost is incurred. The corresponding cost parameters can be used to model different job priorities.


preventive maintenance and minimal repair upon failure. The facility needs to satisfy random demand with a given service level. The failure rate is increasing in the production rate. In order to satisfy all demand, the facility is allowed to subcontract part of the production. Lu and Zhou (2017) consider economic dependence between machines in a series-parallel system. Deterioration on one machine propagates to machines downstream. To find the optimal preventive maintenance interval, they minimize the sum of maintenance-related costs, minimal repair costs, and costs due to quality loss.

The literature on the optimal scheduling of production and maintenance mainly focuses on infinite or deterministic demand. Some studies consider random demand, but none of these studies also consider condition information and multiple machines. Similarly, literature that examines multi-machine systems never considers both stochastic demand and condition information. Contrary to the existing literature, this paper combines all of these elements. Hence, this paper considers the joint optimization of production and maintenance decisions in a multi-machine system with condition information and demand arriving dynamically over time.

3 Problem Formulation

We consider a production line consisting of two machines decoupled by a buffer with limited capacity. At the start of the production line, there exists an infinite supply of items waiting to be processed. Each item needs to be processed by both machines sequentially. At the end of the production line, there exists an inventory in which a limited number of items can be stored. Backordering items is allowed. The goal of the production line is to satisfy random demand. Furthermore, both machines are subject to deterioration and therefore maintenance is occasionally required.

Each machine deteriorates when processing an item according to a discrete-time Markov chain with deterioration states $1, \dots, m, m + 1$. State 1 represents the new state, state $m$ the most deteriorated state, and state $m + 1$ the failed state. The $(m + 1) \times (m + 1)$ transition probability matrix used to determine the new deterioration state after processing on machine $i$ is denoted by $P_i$, $i \in \{1, 2\}$. Each machine can process one item at a time and only deteriorates when processing an item. The current deterioration states of the machines are observed at the start of each time period.
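As a minimal sketch of the deterioration model described above, the following Python snippet encodes a transition matrix for a hypothetical machine with m = 3 and checks its basic properties. The numerical values, the function name `is_valid_deterioration_matrix`, and the requirement that a machine never improves without maintenance (an upper-triangular matrix) are illustrative assumptions and are not taken from the paper.

```python
# Hypothetical transition matrix for m = 3 working states plus the failed state 4.
# Row k holds the probabilities of moving from state k + 1 after processing one item.
P1 = [
    [0.7, 0.2, 0.05, 0.05],  # from state 1 (as good as new)
    [0.0, 0.6, 0.3, 0.1],    # from state 2
    [0.0, 0.0, 0.5, 0.5],    # from state 3 (most deteriorated working state)
    [0.0, 0.0, 0.0, 1.0],    # failed state stays failed until maintenance
]


def is_valid_deterioration_matrix(P, tol=1e-9):
    """Check that P is square, row-stochastic, and upper triangular.

    Upper triangularity encodes the assumption that the condition of a
    machine cannot improve without maintenance.
    """
    n = len(P)
    for k, row in enumerate(P):
        if len(row) != n or any(p < 0 for p in row):
            return False
        if abs(sum(row) - 1.0) > tol:
            return False
        if any(row[l] > tol for l in range(k)):
            return False
    return True
```

Calling `is_valid_deterioration_matrix(P1)` on the matrix above returns `True`.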


to these direct costs, performing maintenance takes time and therefore a machine temporarily cannot process any items. In Section 4 we consider deterministic maintenance durations. Preventive (corrective) maintenance requires $T_{pm}$ ($T_{cm}$) time periods. If preventive or corrective maintenance takes a negligible amount of time, we can set $T_{pm} = 0$ or $T_{cm} = 0$. We assume that $0 \le T_{pm} \le T_{cm}$. In Section 5 we consider stochastic maintenance durations. When machine $i$, $i \in \{1, 2\}$, is being maintained, there is a probability $r_{pm,i}$ ($r_{cm,i}$) that preventive (corrective) maintenance is completed successfully in a given time period. Furthermore, we assume that all types of maintenance are perfect and thus bring the maintained machine back to deterioration state 1. Maintenance cannot be interrupted.

To introduce an additional source of uncertainty into the model and to enable us to investigate scenarios where the processing efficiencies of machines 1 and 2 differ, processing success probabilities are introduced. When machine 1 processes an item, the probability that this procedure is successful is $q_1$. If the processing on machine 1 fails, the machine becomes idle again. The processing success probability for machine 2 is $q_2$. We assume that whenever the processing of an item on machine 2 is unsuccessful, this item is placed back into the buffer and the machine becomes idle again.

In each time period, demand of an uncertain size arrives. Let $D$ denote the random variable representing the size of demand occurring in one time period. We assume $D$ follows a discrete distribution with finite support and maximum value $\bar{D}$. Every time period, there is a probability $P[D = j]$ that the total demand arriving during the time period is $j$, $j \in \{0, \dots, \bar{D}\}$.

The inventory position at the end of the production line is positive when there is a positive amount of stock. If there are backorders, the inventory position is negative. Hence, the inventory position can be either positive or negative. When the inventory position is positive at the end of a time period, we incur a holding cost $h_I$ per item in the inventory. When the inventory position is negative at the end of a time period, a shortage cost $b_I$ per item is incurred. The cost of having items in the buffer at the end of a time period is given by $h_B$ per item. Orders are lost if the maximum number of backorders is surpassed. We then incur a cost of $b_L$ per lost ordered item.

Furthermore, there are finite storage capacities. We therefore set a limit $B_{max} \ge 0$ on the number of items in the buffer. In addition, we assume that the inventory position is between $-w$ and $I_{max}$ in all time periods, where $w \ge 0$ is the maximum number of backordered items and $I_{max} \ge 0$. If the objective is to investigate the described system without these bounds, $B_{max}$, $w$


4 Deterministic Maintenance Durations

In this section, we will formulate the problem as a Markov decision process (MDP) which can be used to optimize joint production and maintenance decisions. We will consider the problem from a cost perspective and with an infinite time horizon.

4.1 The State Space

Let $S$ be the state space. First of all, the state of the system is characterized by the current deterioration states of the machines. These will be denoted by $X_1$ and $X_2$ for machines 1 and 2, respectively. Let $Y_i$, $i \in \{1, 2\}$, be discrete variables indicating whether machine $i$ is idle ($Y_i = 0$), is processing an item ($Y_i = 1$), or still has a remaining maintenance time ($Y_i < 0$). Furthermore, we need to keep track of the number of items in the buffer, denoted by $B$, and the current inventory position, denoted by $I$.

We formulate an MDP with a finite state space, as this allows us to use the value iteration algorithm presented in Section 4.3. The state space of the MDP can be represented by

$$S = \{(X_1, X_2, Y_1, Y_2, B, I) : X_1, X_2 \in \{1, \dots, m, m + 1\},\ Y_1, Y_2 \in \{-T_{cm}, \dots, -T_{pm}, \dots, 0, 1\},\ B \in \{0, \dots, B_{max}\},\ I \in \{-w, \dots, I_{max}\}\}.$$

An element s ∈ S thus has the form s = (X1, X2, Y1, Y2, B, I).
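To get a feeling for the size of this state space, the following sketch enumerates it for small, purely hypothetical parameter values. The function name `build_state_space` and all numbers are assumptions made for illustration. Since $0 \le T_{pm} \le T_{cm}$, the range from $-T_{cm}$ to 1 already contains $-T_{pm}$, so only $T_{cm}$ is needed.

```python
from itertools import product


def build_state_space(m, T_cm, B_max, w, I_max):
    """Enumerate all states (X1, X2, Y1, Y2, B, I) of the deterministic model."""
    X = range(1, m + 2)          # deterioration states 1, ..., m + 1
    Y = range(-T_cm, 2)          # -T_cm, ..., 0 (idle), 1 (processing)
    B = range(0, B_max + 1)      # buffer content
    I = range(-w, I_max + 1)     # inventory position, negative means backorders
    return list(product(X, X, Y, Y, B, I))


# Hypothetical instance: m = 3, T_cm = 2, B_max = 2, w = 2, I_max = 3.
states = build_state_space(m=3, T_cm=2, B_max=2, w=2, I_max=3)
print(len(states))  # 4 * 4 * 4 * 4 * 3 * 6 = 4608
```

The number of states grows multiplicatively in every component, which is why the bounds on the buffer, the backorders, and the inventory matter for computational tractability.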

4.2 Preparation for Value Iteration Algorithm

The value iteration algorithm generates a sequence of functions $v^0, v^1, v^2, \dots \in V$, where $V$ denotes the space of bounded real-valued functions on the state space $S$. Hence, $v : S \to \mathbb{R}$ for all $v \in V$. The value $v^t(s)$ can be interpreted as the total expected cost when in state $s \in S$ with $t$ periods left. The value iteration algorithm iteratively determines the functions in $V$, where $v^t$ is a function of $v^{t-1}$ for all $t \in \mathbb{N}$.


1.1. Decide action for machine 1.
1.2. Decide action for machine 2.
2.1. Observe if processing on machine 1 is successful.
2.2. Observe if processing on machine 2 is successful.
3.1. Observe deterioration on machine 1.
3.2. Observe deterioration on machine 2.
4. Observe demand.

4.2.1 Phases 1.1 and 1.2

In phase 1.1 we decide which action to perform on machine 1 and in phase 1.2 we decide which action to perform on machine 2. When in phase 1.1, in most scenarios, we can choose to leave machine 1 idle (N), maintain machine 1 (M), or process an item on machine 1 (J). There are, however, a few exceptions. When the machine is currently not idle, neither initiating maintenance nor processing an item on machine 1 is possible. Furthermore, as we consider an infinite time horizon and no economic dependence between the machines, if machine 1 is in the failed state it is always optimal to initiate corrective maintenance as quickly as possible. Thus, if $X_1 = m + 1$, given that the machine is idle, we can safely assume that the only eligible action is to initiate maintenance. Lastly, we are bounded by the capacity of the buffer and thus cannot process an item if the buffer is full. The action space as a function of the current state $s \in S$ for this phase is given by

$$A_1(s) = \begin{cases} \{N\} & \text{if } Y_1 \neq 0, \\ \{M\} & \text{if } Y_1 = 0 \text{ and } X_1 = m + 1, \\ \{N, M\} & \text{if } Y_1 = 0,\ X_1 < m + 1 \text{ and } B = B_{max}, \\ \{N, M, J\} & \text{otherwise.} \end{cases}$$

The direct cost corresponding to choosing action $a_1 \in A_1(s)$ in the current state $s \in S$ is given by

$$C_{1.1}(s, a_1) = \begin{cases} 0 & \text{if } a_1 = N \text{ or } a_1 = J, \\ c_{pm} + (c_{cm} - c_{pm}) \mathbb{1}_{X_1 = m + 1} & \text{if } a_1 = M. \end{cases}$$

Note that $\mathbb{1}_Y$ is the indicator function, equal to 1 if statement $Y$ holds and 0 otherwise.
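The action space and direct cost of phase 1.1 translate almost literally into code. The sketch below is one possible encoding, assuming states are stored as tuples (X1, X2, Y1, Y2, B, I) and actions as the strings "N", "M", and "J"; the function names are hypothetical.

```python
def actions_machine_1(s, m, B_max):
    """Return the set A_1(s) of eligible actions for machine 1."""
    X1, X2, Y1, Y2, B, I = s
    if Y1 != 0:            # machine busy: processing or under maintenance
        return {"N"}
    if X1 == m + 1:        # idle and failed: corrective maintenance is forced
        return {"M"}
    if B == B_max:         # buffer full: processing is not possible
        return {"N", "M"}
    return {"N", "M", "J"}


def cost_phase_1_1(s, a1, m, c_pm, c_cm):
    """Direct cost C_1.1(s, a1): c_cm for a failed machine, c_pm otherwise."""
    X1 = s[0]
    if a1 == "M":
        return c_cm if X1 == m + 1 else c_pm
    return 0.0
```

Writing the maintenance cost as `c_cm if X1 == m + 1 else c_pm` is equivalent to the indicator formulation $c_{pm} + (c_{cm} - c_{pm}) \mathbb{1}_{X_1 = m + 1}$.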


be incorporated.

This phase does not include any stochastic elements. We will observe the stochastic effects of the actions in the upcoming phases. The state changes depending on the chosen action. When maintenance is started, the remaining time of maintenance is set to $T_{cm}$ if the machine is in the failed state and to $T_{pm}$ otherwise. As a modeling choice, we set the deterioration state back to the as-good-as-new state at the start of maintenance. If we choose to process an item on machine 1, the only state element changing is $Y_1$, which is set to one. Let $f_{1.1}(s, a_1)$ denote the state at the end of phase 1.1 when action $a_1 \in A_1(s)$ is chosen in state $s \in S$. We have

$$f_{1.1}(s, a_1) = \begin{cases} s & \text{if } a_1 = N, \\ (1, X_2, -T_{cm}, Y_2, B, I) & \text{if } a_1 = M \text{ and } X_1 = m + 1, \\ (1, X_2, -T_{pm}, Y_2, B, I) & \text{if } a_1 = M \text{ and } X_1 < m + 1, \\ (X_1, X_2, 1, Y_2, B, I) & \text{if } a_1 = J. \end{cases}$$
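The deterministic state update of phase 1.1 can be sketched as a small function. The tuple layout (X1, X2, Y1, Y2, B, I), the action labels, and the name `f_1_1` are assumptions made for this illustration.

```python
def f_1_1(s, a1, m, T_pm, T_cm):
    """State at the end of phase 1.1 after choosing action a1 in state s."""
    X1, X2, Y1, Y2, B, I = s
    if a1 == "N":
        return s
    if a1 == "M":
        # Corrective maintenance for a failed machine, preventive otherwise.
        duration = T_cm if X1 == m + 1 else T_pm
        # The deterioration state is reset at the start of maintenance.
        return (1, X2, -duration, Y2, B, I)
    if a1 == "J":
        return (X1, X2, 1, Y2, B, I)
    raise ValueError(f"unknown action: {a1!r}")
```

With `T_pm = 0` the machine is as good as new and idle again within the same phase, which matches the remark in Section 3 that negligible maintenance durations can be modeled by setting them to zero.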

For phase 1.2 a similar logic applies but now for machine 2. We can choose between leaving the machine idle (N ), initiating maintenance (M ) or processing an item on machine 2 (J ). We cannot start maintenance or process an item on machine 2 if the machine is currently not idle. If machine 2 is in the failed state and idle, the only eligible action is to start maintenance on machine 2. Furthermore, we can also not process an item if the buffer is empty or the inventory is full. The action space for phase 1.2 as a function of system state s ∈ S is then given by

$$A_2(s) = \begin{cases} \{N\} & \text{if } Y_2 \neq 0, \\ \{M\} & \text{if } Y_2 = 0 \text{ and } X_2 = m + 1, \\ \{N, M\} & \text{if } Y_2 = 0,\ X_2 < m + 1 \text{ and } (B = 0 \text{ or } I = I_{max}), \\ \{N, M, J\} & \text{otherwise.} \end{cases}$$
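For machine 2 the only difference from machine 1 lies in the blocking and starvation conditions. A possible encoding, again assuming the tuple layout (X1, X2, Y1, Y2, B, I) and a hypothetical function name, is:

```python
def actions_machine_2(s, m, I_max):
    """Return the set A_2(s) of eligible actions for machine 2."""
    X1, X2, Y1, Y2, B, I = s
    if Y2 != 0:                  # machine busy: processing or under maintenance
        return {"N"}
    if X2 == m + 1:              # idle and failed: corrective maintenance is forced
        return {"M"}
    if B == 0 or I == I_max:     # starved (empty buffer) or blocked (full inventory)
        return {"N", "M"}
    return {"N", "M", "J"}
```

States in which machine 2 is starved or blocked still allow maintenance, which reflects the maintenance opportunities mentioned in the introduction.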

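Furthermore, for $s \in S$ and $a_2 \in A_2(s)$, the direct cost function is given by

$$C_{1.2}(s, a_2) = \begin{cases} 0 & \text{if } a_2 = N \text{ or } a_2 = J, \\ c_{pm} + (c_{cm} - c_{pm}) \mathbb{1}_{X_2 = m + 1} & \text{if } a_2 = M, \end{cases}$$

and the new state by

$$f_{1.2}(s, a_2) = \begin{cases} s & \text{if } a_2 = N, \\ (X_1, 1, Y_1, -T_{cm}, B, I) & \text{if } a_2 = M \text{ and } X_2 = m + 1, \\ (X_1, 1, Y_1, -T_{pm}, B, I) & \text{if } a_2 = M \text{ and } X_2 < m + 1, \\ (X_1, X_2, Y_1, 1, B, I) & \text{if } a_2 = J. \end{cases}$$

4.2.2 Phases 2.1 and 2.2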

If in phase 1.1 we choose to process an item on machine 1, in phase 2.1 we observe whether it is successful. Otherwise, the state does not change during this phase. Let $p_{2.1}(s' \mid s)$ denote the probability of moving to state $s'$ from state $s$ during this phase. When processing on machine 1, with probability $q_1$ this will be successful and then the buffer quantity ($B$) increases by one. With probability $1 - q_1$, the processing is not successful and $B$ remains the same. This leads to the following transition probabilities:

$$p_{2.1}(s' \mid s) = \begin{cases} 1 & \text{if } Y_1 \neq 1 \text{ and } s' = s, \\ q_1 & \text{if } Y_1 = 1 \text{ and } s' = (X_1, X_2, Y_1, Y_2, B + 1, I), \\ 1 - q_1 & \text{if } Y_1 = 1 \text{ and } s' = s, \\ 0 & \text{otherwise.} \end{cases}$$
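One convenient way to store such sparse transition probabilities is a dictionary that maps each reachable successor state to its probability. The sketch below does this for phase 2.1; the representation and the name `phase_2_1` are assumptions for illustration.

```python
def phase_2_1(s, q1):
    """Successor distribution of phase 2.1 as {next_state: probability}."""
    X1, X2, Y1, Y2, B, I = s
    if Y1 != 1:
        # Machine 1 is not processing, so nothing changes.
        return {s: 1.0}
    success = (X1, X2, Y1, Y2, B + 1, I)   # item arrives in the buffer
    return {success: q1, s: 1.0 - q1}


dist = phase_2_1((1, 1, 1, 0, 0, 0), q1=0.9)
```

Note that $Y_1$ is still equal to one after this phase. The machine only becomes idle again in phase 3.1, after deterioration has been observed.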

Let $p_{2.2}(s' \mid s)$ denote the probability of moving to state $s'$ from state $s$ during phase 2.2. If no processing is taking place on machine 2, the state remains the same. When processing on machine 2, with probability $q_2$ the buffer quantity decreases by one while the inventory position increases by one. Recall that we assume that whenever the processing of an item on machine 2 is not successful, this item is placed back into the buffer. With probability $1 - q_2$, the item is not processed successfully and as a result $B$ and $I$ remain the same.
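The description of phase 2.2 above can be sketched in the same dictionary style. The tuple layout (X1, X2, Y1, Y2, B, I) and the name `phase_2_2` are assumptions for illustration.

```python
def phase_2_2(s, q2):
    """Successor distribution of phase 2.2 as {next_state: probability}."""
    X1, X2, Y1, Y2, B, I = s
    if Y2 != 1:
        # Machine 2 is not processing, so nothing changes.
        return {s: 1.0}
    # On success one item leaves the buffer and enters the inventory.
    success = (X1, X2, Y1, Y2, B - 1, I + 1)
    # On failure the item is placed back, so B and I are unchanged.
    return {success: q2, s: 1.0 - q2}
```

Because action J is only eligible for machine 2 when the buffer is non-empty and the inventory is not full, the successful successor state is a valid element of the state space whenever machine 2 started processing in this period.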


4.2.3 Phases 3.1 and 3.2

In phases 3.1 and 3.2 we account for the deterioration effects of processing items on the machines. Starting with phase 3.1, when machine 1 is active ($Y_1 = 1$), deterioration takes place at that machine. The exact deterioration depends on the transition probability matrix $P_1$. Let $P_1[k, l]$ denote the $[k, l]$-th element of $P_1$. Given the current deterioration state $X_1$, the new deterioration state for machine 1 is $X_1'$ with probability $P_1[X_1, X_1']$. If machine 1 is processing an item, it will be idle at the end of this phase, and thus $Y_1' = 0$. Lastly, when machine 1 is currently being maintained ($Y_1 < 0$), the remaining maintenance time of the machine is decreased by one.

The probability of moving from state $s$ to state $s'$ during this phase is given by

$$p_{3.1}(s' \mid s) = \begin{cases} 1 & \text{if } Y_1 < 0 \text{ and } s' = (X_1, X_2, Y_1 + 1, Y_2, B, I), \\ 1 & \text{if } Y_1 = 0 \text{ and } s' = s, \\ P_1[X_1, X_1'] & \text{if } Y_1 = 1 \text{ and } s' = (X_1', X_2, 0, Y_2, B, I), \\ 0 & \text{otherwise.} \end{cases}$$
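The three cases of phase 3.1 can be sketched as follows. The matrix values, the tuple layout (X1, X2, Y1, Y2, B, I), and the name `phase_3_1` are hypothetical and only serve as an illustration.

```python
# Hypothetical deterioration matrix for m = 2 (states 1, 2 and failed state 3).
P1 = [
    [0.8, 0.15, 0.05],
    [0.0, 0.7, 0.3],
    [0.0, 0.0, 1.0],
]


def phase_3_1(s, P1):
    """Successor distribution of phase 3.1 as {next_state: probability}."""
    X1, X2, Y1, Y2, B, I = s
    if Y1 < 0:
        # Under maintenance: one period of maintenance time elapses.
        return {(X1, X2, Y1 + 1, Y2, B, I): 1.0}
    if Y1 == 0:
        # Idle machines do not deteriorate.
        return {s: 1.0}
    # Processing: sample the new deterioration state and become idle.
    row = P1[X1 - 1]   # states are 1-based, list indices are 0-based
    return {(x_new, X2, 0, Y2, B, I): p
            for x_new, p in enumerate(row, start=1) if p > 0}
```

Dropping successors with probability zero keeps the dictionaries small, which helps when the phases are chained inside value iteration.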

In phase 3.2 we follow an analogous reasoning for the deterioration on machine 2, which leads to the transition probabilities $p_{3.2}(s' \mid s)$. We have

$$p_{3.2}(s' \mid s) = \begin{cases} 1 & \text{if } Y_2 < 0 \text{ and } s' = (X_1, X_2, Y_1, Y_2 + 1, B, I), \\ 1 & \text{if } Y_2 = 0 \text{ and } s' = s, \\ P_2[X_2, X_2'] & \text{if } Y_2 = 1 \text{ and } s' = (X_1, X_2', Y_1, 0, B, I), \\ 0 & \text{otherwise.} \end{cases}$$

4.2.4 Phase 4

The last phase consists of observing demand. Let $p_4(s' \mid s)$ denote the probability of moving from state $s$ to state $s'$ during phase 4. Recall that $D$ follows a discrete distribution with finite support and maximum value $\bar{D}$. There is a probability $P[D = j]$ that the demand in this time period is of size $j$. We can only accept the full demand when it does not exceed $I + w$. If the demand is larger than this, we assume that the inventory position decreases by $I + w$ and the demand is only partially satisfied.

To incorporate the cost of the lost orders we make use of the expected number of lost orders. The expected number of lost orders is given by $E[\max\{D - (I + w), 0\}]$, or equivalently $E[(D - (I + w))^+]$. As $b_L$ is the cost per lost ordered item, the total expected cost of lost orders when in state $s$ is given by

$$C_l(s) = b_L \cdot E[(D - (I + w))^+].$$

These costs are incurred before observing the realized demand.
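For a demand distribution with finite support, the expected number of lost orders is a finite sum. The sketch below assumes the distribution is given as a list `pmf` with `pmf[j]` equal to $P[D = j]$; the function names and the numbers are illustrative assumptions.

```python
def expected_lost_orders(pmf, I, w):
    """E[(D - (I + w))^+] for a demand pmf given as a list indexed by j."""
    room = I + w   # demand that can still be served or backordered
    return sum(p * max(j - room, 0) for j, p in enumerate(pmf))


def lost_order_cost(pmf, I, w, b_L):
    """Expected cost of lost orders, C_l(s) = b_L * E[(D - (I + w))^+]."""
    return b_L * expected_lost_orders(pmf, I, w)


# Hypothetical demand: P[D=0] = 0.2, P[D=1] = 0.5, P[D=2] = 0.3.
pmf = [0.2, 0.5, 0.3]
```

With `I = -1` and `w = 2` only one more unit can be backordered, so a demand of two loses one order and the expectation equals 0.3.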

The probability of moving from state $s$ to state $s'$ during phase 4 is subsequently given by

$$p_4(s' \mid s) = \begin{cases} P[D = j] & \text{if } s' = (X_1, X_2, Y_1, Y_2, B, \max\{I - j, -w\}) \text{ and } j \in \{0, \dots, \bar{D}\}, \\ 0 & \text{otherwise.} \end{cases}$$
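When implementing phase 4, it is worth noting that several demand sizes can lead to the same successor state once the backorder limit is reached, in which case their probabilities add up. The following sketch, with a hypothetical function name and a demand pmf given as a list, accumulates them explicitly.

```python
def phase_4(s, pmf, w):
    """Successor distribution of phase 4 as {next_state: probability}.

    pmf[j] is P[D = j]; the inventory position is truncated at -w.
    """
    X1, X2, Y1, Y2, B, I = s
    dist = {}
    for j, p in enumerate(pmf):
        s_next = (X1, X2, Y1, Y2, B, max(I - j, -w))
        # Different j may map to the same truncated state: sum the mass.
        dist[s_next] = dist.get(s_next, 0.0) + p
    return dist


# Hypothetical demand pmf on {0, 1, 2} and a state with one backorder.
dist = phase_4((1, 1, 0, 0, 0, -1), [0.2, 0.5, 0.3], w=2)
```

In this example the demand sizes 1 and 2 both lead to the inventory position $-w = -2$, so that state receives their combined probability.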

As mentioned earlier in Section 3, we also have a direct cost depending on the buffer and inventory position at the end of the time period. Let $C_4(s)$ denote this direct cost when in system state $s \in S$. Recall that $h_I$ is the holding cost per item in the inventory, $b_I$ the shortage cost incurred per item, and $h_B$ the holding cost for an item in the buffer. Then,

$$C_4(s) = h_I \cdot \max\{I, 0\} + b_I \cdot \max\{-I, 0\} + h_B \cdot B.$$
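As a small worked example of this cost function, consider the following sketch. The function name and the cost values are assumptions chosen for illustration.

```python
def cost_phase_4(s, h_I, b_I, h_B):
    """Holding and shortage cost C_4(s) at the end of a time period."""
    B, I = s[4], s[5]
    holding = h_I * max(I, 0)      # stock on hand
    shortage = b_I * max(-I, 0)    # outstanding backorders
    buffer_cost = h_B * B          # work in process in the buffer
    return holding + shortage + buffer_cost


# Two items in the buffer and three backorders: 0 + 5 * 3 + 0.5 * 2 = 16.
print(cost_phase_4((1, 1, 0, 0, 2, -3), h_I=1.0, b_I=5.0, h_B=0.5))  # 16.0
```

At most one of the holding and shortage terms is non-zero, since the inventory position cannot be positive and negative at the same time.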

4.3 Value Iteration Algorithm

The value iteration algorithm is a method that can be used to find a stationary $\varepsilon$-optimal policy when dealing with infinite-horizon MDPs. As mentioned before, the value iteration algorithm iteratively determines the elements in $V$, where $v^t$ is a function of $v^{t-1}$ for all $t \in \mathbb{N}$. We set $v^0(s) = 0$ for all $s \in S$. The values determined by the algorithm provide a lower and upper bound on the minimum average cost. These lower and upper bounds are

$$\min_{s \in S} \{v^n(s) - v^{n-1}(s)\} \quad \text{and} \quad \max_{s \in S} \{v^n(s) - v^{n-1}(s)\},$$

respectively. Upon termination of the algorithm, the difference between these bounds and the average cost is at most $\varepsilon$. In the remainder of this paper we will work with an approximation of the average cost given by the average of the lower and upper bound,

$$\frac{1}{2} \left[ \min_{s \in S} \{v^n(s) - v^{n-1}(s)\} + \max_{s \in S} \{v^n(s) - v^{n-1}(s)\} \right].$$

As we are dealing with an infinite horizon MDP, we consider stationary policies, that is, policies that use decision rule d every time period. Such a decision rule is a function d : S → A(s), and specifies the action that is chosen in state s ∈ S. Let the span of v be denoted by sp(v) and defined as

$$sp(v) = \max_{s \in S} v(s) - \min_{s \in S} v(s).$$
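The span and the bounds on the average cost are straightforward to compute once two successive value functions are available. The sketch below assumes value functions are stored as dictionaries keyed by state; the function names are hypothetical.

```python
def span(v):
    """sp(v) = max_s v(s) - min_s v(s) for a value function stored as a dict."""
    return max(v.values()) - min(v.values())


def average_cost_bounds(v_new, v_old):
    """Lower bound, upper bound, and midpoint estimate of the average cost."""
    diffs = [v_new[s] - v_old[s] for s in v_new]
    lower, upper = min(diffs), max(diffs)
    return lower, upper, 0.5 * (lower + upper)


# Toy example with two states.
v_old = {"a": 0.0, "b": 0.0}
v_new = {"a": 1.0, "b": 3.0}
```

The difference between the upper and lower bound is exactly the span of $v^n - v^{n-1}$, so a small span means the midpoint is a tight estimate of the average cost.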


can be described as follows:

1. Set $v^0(s) = 0$ for all $s \in S$.

2. For each $s \in S$, $v^t$ can be computed as a function of $v^{t-1}$ for all $t \in \mathbb{N}$ by using the phases from Section 4.2 backwards. More precisely, compute $v^t(s)$ by solving:

$$\begin{aligned}
u^t_{4}(s) &= v^{t-1}(s) + C_4(s), \\
u^t_{3.2}(s) &= \sum_{s' \in S} p_4(s' \mid s)\, u^t_{4}(s') + C_l(s), \\
u^t_{3.1}(s) &= \sum_{s' \in S} p_{3.2}(s' \mid s)\, u^t_{3.2}(s'), \\
u^t_{2.2}(s) &= \sum_{s' \in S} p_{3.1}(s' \mid s)\, u^t_{3.1}(s'), \\
u^t_{2.1}(s) &= \sum_{s' \in S} p_{2.2}(s' \mid s)\, u^t_{2.2}(s'), \\
u^t_{1.2}(s) &= \sum_{s' \in S} p_{2.1}(s' \mid s)\, u^t_{2.1}(s'), \\
u^t_{1.1}(s) &= \min_{a_2 \in A_2(s)} \left\{ u^t_{1.2}(f_{1.2}(s, a_2)) + C_{1.2}(s, a_2) \right\}, \\
v^t(s) &= \min_{a_1 \in A_1(s)} \left\{ u^t_{1.1}(f_{1.1}(s, a_1)) + C_{1.1}(s, a_1) \right\}.
\end{aligned}$$

3. If $sp(v^t - v^{t-1}) < \varepsilon$, go to step 4. Otherwise, increase $t$ by one and return to step 2.

4. For each $s \in S$, choose

$$d_2(s) \in \arg\min_{a_2 \in A_2(s)} \left\{ u^t_{1.2}(f_{1.2}(s, a_2)) + C_{1.2}(s, a_2) \right\} \quad \text{and} \quad d_1(s) \in \arg\min_{a_1 \in A_1(s)} \left\{ u^t_{1.1}(f_{1.1}(s, a_1)) + C_{1.1}(s, a_1) \right\},$$

where $d_l(s)$ denotes the decision rule concerning machine $l$, $l \in \{1, 2\}$, for all $s \in S$.
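To illustrate the stopping rule and the extraction of the decision rule, the following sketch runs value iteration with the span criterion on a deliberately tiny, made-up MDP. It uses a single phase per period rather than the multi-phase recursion of this paper, and the two-state machine, its costs, and its transition probabilities are hypothetical.

```python
# Toy average-cost MDP: a machine that is "good" or "bad" (all values made up).
STATES = ["good", "bad"]
ACTIONS = {"good": ["run"], "bad": ["run", "repair"]}
COST = {("good", "run"): 0.0, ("bad", "run"): 4.0, ("bad", "repair"): 2.0}
TRANS = {
    ("good", "run"): {"good": 0.5, "bad": 0.5},
    ("bad", "run"): {"bad": 1.0},
    ("bad", "repair"): {"good": 1.0},
}


def value_iteration(eps=1e-6, max_iter=10_000):
    """Return (policy, average-cost estimate) using the span stopping rule."""
    v_old = {s: 0.0 for s in STATES}          # step 1: v^0 = 0
    for _ in range(max_iter):
        v_new, policy = {}, {}
        for s in STATES:                      # step 2: one backup per state
            best_a, best_q = None, float("inf")
            for a in ACTIONS[s]:
                q = COST[s, a] + sum(p * v_old[s2]
                                     for s2, p in TRANS[s, a].items())
                if q < best_q:
                    best_a, best_q = a, q
            v_new[s], policy[s] = best_q, best_a
        diffs = [v_new[s] - v_old[s] for s in STATES]
        if max(diffs) - min(diffs) < eps:     # step 3: sp(v^t - v^{t-1}) < eps
            return policy, 0.5 * (min(diffs) + max(diffs))   # step 4
        v_old = v_new
    raise RuntimeError("value iteration did not converge")


policy, avg_cost = value_iteration()
```

For this toy instance the returned policy repairs the machine in the bad state, and the estimated average cost is close to 2/3, which matches the stationary distribution (2/3, 1/3) of the resulting chain multiplied by the repair cost of 2. In the model of this paper, the inner minimization would be replaced by the chain of phase updates $u^t_4, \dots, u^t_{1.1}$ listed in step 2.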

5 Stochastic Maintenance Durations


maintenance is being performed on machine $i$, $i \in \{1, 2\}$, it moves to the as-good-as-new state in the current time period with probability $r_{pm,i}$ when it concerns preventive maintenance and with probability $r_{cm,i}$ when performing corrective maintenance.

5.1 The State Space

As the setting closely resembles that of Section 4, analogous notation is used. Hence, $S$ represents the state space, $X_i$ the deterioration state of machine $i$, $Y_i$ the current occupation state of machine $i$, $B$ the number of items in the buffer, and $I$ the current inventory position, $i \in \{1, 2\}$. We set $Y_i = -2$ if machine $i$ is currently under corrective maintenance and $Y_i = -1$ if machine $i$ is currently under preventive maintenance. Furthermore, $Y_i = 0$ indicates an idle machine, and $Y_i = 1$ implies that machine $i$ is processing an item. Reintroducing the buffer capacity $B_{max}$, the maximum number of backorders $w$, and the inventory capacity $I_{max}$ from Section 4, we get the following state space:

S = {(X1, X2, Y1, Y2, B, I) : X1 ∈ {1, ..., m, m + 1}, X2 ∈ {1, ..., m, m + 1}, Y1 ∈ {−2, −1, 0, 1}, Y2 ∈ {−2, −1, 0, 1}, B ∈ {0, ..., Bmax}, I ∈ {−w, ..., Imax}}.

5.2 Preparation for Value Iteration Algorithm

Analogous to Section 4, the transition from $v^t$ to $v^{t-1}$ is split into different phases. As we now have uncertainty when performing maintenance, two additional phases are added in which we observe whether maintenance is finished. We have the following phases:

1.1. Decide action for machine 1.
1.2. Decide action for machine 2.
2.1. Observe if maintenance on machine 1 is finished.
2.2. Observe if maintenance on machine 2 is finished.
3.1. Observe if processing on machine 1 is successful.
3.2. Observe if processing on machine 2 is successful.
4.1. Observe deterioration on machine 1.


5.2.1 Phases 1.1 and 1.2

In these phases we decide which actions to perform on both machines. We start with phase 1.1 and thus with the action to be performed on machine 1. Let N indicate leaving the machine idle, M denote initiating maintenance, and J represent processing an item on the machine. We cannot initiate processing or maintenance on machine 1 when it is not idle. Maintenance must be initiated when the machine is in the failed state and currently idle. In addition, machine 1 cannot process an item when the buffer is full. In all other situations, we can choose to initiate maintenance, process an item, or leave the machine idle. This leads to the following action space as a function of the current state $s \in S$:

$$A_1(s) = \begin{cases} \{N\} & \text{if } Y_1 \neq 0, \\ \{M\} & \text{if } Y_1 = 0 \text{ and } X_1 = m + 1, \\ \{N, M\} & \text{if } Y_1 = 0,\ X_1 < m + 1 \text{ and } B = B_{max}, \\ \{N, M, J\} & \text{otherwise.} \end{cases}$$

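The direct cost corresponding to action $a_1 \in A_1(s)$ when in state $s \in S$ is given by

$$C_{1.1}(s, a_1) = \begin{cases} 0 & \text{if } a_1 = N \text{ or } a_1 = J, \\ c_{pm} + (c_{cm} - c_{pm}) \mathbb{1}_{X_1 = m + 1} & \text{if } a_1 = M. \end{cases}$$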

This phase does not include any stochastic elements. We will observe the stochastic effects of the actions in upcoming phases. The change of the state depends on the chosen action. When maintenance is started, $Y_1$ is set to $-2$ or $-1$ depending on the type of maintenance. The deterioration state is set back to the as-good-as-new state at the start of maintenance. When choosing to process an item on machine 1, $Y_1$ is set to one. Let $f_{1.1}(s, a_1)$ denote the intermediate state at the end of phase 1.1 when action $a_1 \in A_1(s)$ is chosen in state $s \in S$. We have

$$f_{1.1}(s, a_1) = \begin{cases} s & \text{if } a_1 = N, \\ (1, X_2, -2, Y_2, B, I) & \text{if } a_1 = M \text{ and } X_1 = m + 1, \\ (1, X_2, -1, Y_2, B, I) & \text{if } a_1 = M \text{ and } X_1 < m + 1, \\ (X_1, X_2, 1, Y_2, B, I) & \text{if } a_1 = J. \end{cases}$$


machine is currently not idle. Maintenance must be started when the machine is in the failed state and currently idle. Furthermore, an item can only be processed if the machine is idle, the buffer is non-empty, and the inventory is not full. The action space for phase 1.2 and $s \in S$ is then given by

$$A_2(s) = \begin{cases} \{N\} & \text{if } Y_2 \neq 0, \\ \{M\} & \text{if } Y_2 = 0 \text{ and } X_2 = m + 1, \\ \{N, M\} & \text{if } Y_2 = 0,\ X_2 < m + 1 \text{ and } (B = 0 \text{ or } I = I_{max}), \\ \{N, M, J\} & \text{otherwise.} \end{cases}$$

Furthermore, for $s \in S$ and $a_2 \in A_2(s)$, the direct cost function is given by

$$C_{1.2}(s, a_2) = \begin{cases} 0 & \text{if } a_2 = N \text{ or } a_2 = J, \\ c_{pm} + (c_{cm} - c_{pm}) \mathbb{1}_{X_2 = m + 1} & \text{if } a_2 = M, \end{cases}$$

and the new state by

$$f_{1.2}(s, a_2) = \begin{cases} s & \text{if } a_2 = N, \\ (X_1, 1, Y_1, -2, B, I) & \text{if } a_2 = M \text{ and } X_2 = m + 1, \\ (X_1, 1, Y_1, -1, B, I) & \text{if } a_2 = M \text{ and } X_2 < m + 1, \\ (X_1, X_2, Y_1, 1, B, I) & \text{if } a_2 = J. \end{cases}$$

5.2.2 Phases 2.1 and 2.2

In the newly designed phase 2.1, we observe if any initiated maintenance on machine 1 finishes this time period. If no maintenance is performed on machine 1 during this time period, the state remains the same with probability one. If maintenance is being performed and it is preventive, with probability rpm, the maintenance finishes this time period and the machine is idle again. With probability 1 − rpm, the preventive maintenance is not done yet and the state does not change. Note that Y1 thus will remain unchanged and maintenance continues in the next time


of moving from state s to state s′ during phase 2.1:

\[
p_{2.1}(s' \mid s) =
\begin{cases}
1 & \text{if } Y_1 \geq 0 \text{ and } s' = s, \\
r_{pm} & \text{if } Y_1 = -1 \text{ and } s' = (X_1, X_2, 0, Y_2, B, I), \\
1 - r_{pm} & \text{if } Y_1 = -1 \text{ and } s' = s, \\
r_{cm} & \text{if } Y_1 = -2 \text{ and } s' = (X_1, X_2, 0, Y_2, B, I), \\
1 - r_{cm} & \text{if } Y_1 = -2 \text{ and } s' = s, \\
0 & \text{otherwise.}
\end{cases}
\]
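As an illustration of the phase 2.1 transition probabilities, the following Python sketch returns the distribution over successor states as a dictionary. The function name is hypothetical, states are plain tuples (X1, X2, Y1, Y2, B, I), and the default completion probabilities rpm = 1/2 and rcm = 1/3 are the values used in the numerical analysis.

```python
R_PM, R_CM = 1 / 2, 1 / 3  # assumed values from the numerical analysis


def phase_2_1(state, r_pm=R_PM, r_cm=R_CM):
    """Return {next_state: probability} for phase 2.1 (machine 1 maintenance)."""
    X1, X2, Y1, Y2, B, I = state
    if Y1 >= 0:
        # No maintenance on machine 1: the state is unchanged.
        return {state: 1.0}
    r = r_pm if Y1 == -1 else r_cm
    finished = (X1, X2, 0, Y2, B, I)  # maintenance done, machine idle again
    return {finished: r, state: 1.0 - r}
```

Phase 2.2 would follow the same pattern with the roles of Y1 and Y2 interchanged.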

In phase 2.2 we have the same procedure as in phase 2.1 but now for machine 2. So,

\[
p_{2.2}(s' \mid s) =
\begin{cases}
1 & \text{if } Y_2 \geq 0 \text{ and } s' = s, \\
r_{pm} & \text{if } Y_2 = -1 \text{ and } s' = (X_1, X_2, Y_1, 0, B, I), \\
1 - r_{pm} & \text{if } Y_2 = -1 \text{ and } s' = s, \\
r_{cm} & \text{if } Y_2 = -2 \text{ and } s' = (X_1, X_2, Y_1, 0, B, I), \\
1 - r_{cm} & \text{if } Y_2 = -2 \text{ and } s' = s, \\
0 & \text{otherwise.}
\end{cases}
\]

5.2.3 Phases 3.1 and 3.2

In phases 3.1 and 3.2 we incorporate the processing success probabilities. Recall that when processing on machine i, i ∈ {1, 2}, there is a probability of qi that processing is successful.

When processing on machine 1 is successful, the number of items in the buffer increases by one. For phase 3.1 this leads to the following transition probabilities:

\[
p_{3.1}(s' \mid s) =
\begin{cases}
1 & \text{if } Y_1 \neq 1 \text{ and } s' = s, \\
q_1 & \text{if } Y_1 = 1 \text{ and } s' = (X_1, X_2, Y_1, Y_2, B + 1, I), \\
1 - q_1 & \text{if } Y_1 = 1 \text{ and } s' = s, \\
0 & \text{otherwise.}
\end{cases}
\]


are given by p3.2(s′|s). We have

\[
p_{3.2}(s' \mid s) =
\begin{cases}
1 & \text{if } Y_2 \neq 1 \text{ and } s' = s, \\
q_2 & \text{if } Y_2 = 1 \text{ and } s' = (X_1, X_2, Y_1, Y_2, B - 1, I + 1), \\
1 - q_2 & \text{if } Y_2 = 1 \text{ and } s' = s, \\
0 & \text{otherwise.}
\end{cases}
\]

5.2.4 Phases 4.1 and 4.2

Just as with deterministic maintenance durations, we observe the deterioration that occurs when the machines are used for processing. In phase 4.1 this is done for machine 1. If machine 1 is processing, it deteriorates according to deterioration matrix P1. Furthermore, at the end of this phase, the machine is marked as idle again. The probability of transitioning from state s to state s′ during phase 4.1 is denoted by p4.1(s′|s). We have

\[
p_{4.1}(s' \mid s) =
\begin{cases}
1 & \text{if } Y_1 \neq 1 \text{ and } s' = s, \\
P_1[X_1, X_1'] & \text{if } Y_1 = 1 \text{ and } s' = (X_1', X_2, 0, Y_2, B, I), \\
0 & \text{otherwise.}
\end{cases}
\]
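A minimal Python sketch of the phase 4.1 transition is given here for illustration. It assumes states are tuples (X1, X2, Y1, Y2, B, I), uses a hypothetical function name, and plugs in the rounded deterioration matrix reported in the numerical analysis, so the rows sum to one only approximately.

```python
# Rounded deterioration matrix from the numerical analysis (m = 3, state 4 = failed).
P = [
    [0.867, 0.091, 0.022, 0.018],
    [0.0, 0.867, 0.091, 0.041],
    [0.0, 0.0, 0.867, 0.131],
    [0.0, 0.0, 0.0, 1.0],
]


def phase_4_1(state, P1=P):
    """Return {next_state: probability} for phase 4.1 (deterioration of machine 1)."""
    X1, X2, Y1, Y2, B, I = state
    if Y1 != 1:
        # Machine 1 was not processing, so it does not deteriorate.
        return {state: 1.0}
    row = P1[X1 - 1]  # deterioration states are 1-based
    # After processing, machine 1 is idle again (Y1 = 0).
    return {
        (x, X2, 0, Y2, B, I): p
        for x, p in enumerate(row, start=1)
        if p > 0
    }
```

Calling `phase_4_1((3, 1, 1, 0, 0, 0))` yields two successors: the machine stays in state 3 or fails, and in both cases it becomes idle.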

Analogously, in phase 4.2 we observe the deterioration on machine 2. Let p4.2(s′|s) denote the probability of going from state s to state s′ during phase 4.2. We have

\[
p_{4.2}(s' \mid s) =
\begin{cases}
1 & \text{if } Y_2 \neq 1 \text{ and } s' = s, \\
P_2[X_2, X_2'] & \text{if } Y_2 = 1 \text{ and } s' = (X_1, X_2', Y_1, 0, B, I), \\
0 & \text{otherwise.}
\end{cases}
\]

5.2.5 Phase 5

The last phase of the MDP with stochastic maintenance durations is exactly the same as the last phase from Section 4.2. There is a probability P[D = j] that a new order of size j arrives, where D follows a discrete distribution with finite support. As bL is the cost per lost ordered item, the total cost of lost orders in state s ∈ S is


The probability of moving from s to s′ during this phase, denoted by p5(s′|s), is given by

\[
p_5(s' \mid s) =
\begin{cases}
P[D = j] & \text{if } s' = (X_1, X_2, Y_1, Y_2, B, \max\{I - j, -w\}) \text{ and } j \in \{0, \ldots, \bar{D}\}, \\
0 & \text{otherwise.}
\end{cases}
\]

Lastly, the total cost corresponding to the buffer and inventory when in state s ∈ S is

\[
C_5(s) = h_I \cdot \max\{I, 0\} + b_I \cdot \max\{-I, 0\} + h_B \cdot B,
\]

where hI, bI and hB are the inventory holding cost per item, the shortage cost per item and the buffer holding cost per item, respectively.
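The demand transition and the cost C5(s) can be illustrated with a short Python sketch. The function names are hypothetical, the demand distribution and w = 8 are the base-case values of the numerical analysis, and the shortage cost used in the example call (bI = 2) is a purely illustrative placeholder, not a value taken from the model.

```python
DEMAND = {0: 0.7, 1: 0.2, 2: 0.1}  # base-case P[D = j], assumed here
W = 8                              # maximum number of backorders, assumed here


def phase_5(state, demand=DEMAND, w=W):
    """Return {next_state: probability} after demand is realized."""
    X1, X2, Y1, Y2, B, I = state
    out = {}
    for j, p in demand.items():
        nxt = (X1, X2, Y1, Y2, B, max(I - j, -w))
        # Different demand sizes can lead to the same state at the bound -w.
        out[nxt] = out.get(nxt, 0.0) + p
    return out


def cost_5(state, h_I, b_I, h_B):
    """Holding and shortage cost C5(s) for the buffer and the inventory."""
    *_, B, I = state
    return h_I * max(I, 0) + b_I * max(-I, 0) + h_B * B


# Example with a placeholder shortage cost b_I = 2 (illustrative only).
example_cost = cost_5((1, 1, 0, 0, 2, -3), h_I=1.0, b_I=2.0, h_B=1.0)
```

When the inventory position already equals −w, all demand realizations map to the same successor state, which is why the probabilities are accumulated rather than overwritten.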

5.3 Value Iteration Algorithm

This section contains an adjusted version of the value iteration algorithm of Section 4.3 applicable for the model with stochastic maintenance durations.

1. Set v0(s) = 0 for all s ∈ S.

2. For each s ∈ S, $v^t$ can be computed as a function of $v^{t-1}$ for all t ∈ N by using the phases from Section 5.2 backwards. More precisely, compute $v^t(s)$ by solving:


3. If

\[
sp(v^t - v^{t-1}) < \epsilon,
\]

go to step 4. Otherwise, increase t by one and return to step 2.

4. For each s ∈ S, choose

\[
d_2(s) \in \arg\min_{a_2 \in A_2(s)} \left\{ u^t_{1.2}(f_{1.2}(s, a_2)) + C_{1.2}(s, a_2) \right\}
\quad \text{and} \quad
d_1(s) \in \arg\min_{a_1 \in A_1(s)} \left\{ u^t_{1.1}(f_{1.1}(s, a_1)) + C_{1.1}(s, a_1) \right\},
\]

where dl(s) denotes the decision rule concerning machine l, l ∈ {1, 2} for all s ∈ S.
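To illustrate the stopping rule based on the span, the following Python sketch runs value iteration on a small generic cost-minimization MDP. It does not implement the phased structure of Section 5.2: the two-state example (a "good" and a "bad" machine state with actions "run" and "repair"), its costs and all function names are hypothetical and serve only to show how the span criterion and the resulting decision rule can be computed.

```python
def span(x):
    """Span seminorm: difference between the largest and smallest component."""
    return max(x) - min(x)


def value_iteration(costs, trans, eps=1e-3, max_iter=100_000):
    """costs[a][s] is the direct cost, trans[a][s][t] the transition probability."""
    n_states = len(costs[0])
    n_actions = len(costs)
    v = [0.0] * n_states
    for _ in range(max_iter):
        v_new, policy = [], []
        for s in range(n_states):
            q = [
                costs[a][s] + sum(p * v[t] for t, p in enumerate(trans[a][s]))
                for a in range(n_actions)
            ]
            best = min(range(n_actions), key=q.__getitem__)
            v_new.append(q[best])
            policy.append(best)
        diff = [new - old for new, old in zip(v_new, v)]
        if span(diff) < eps:
            # The average cost lies between min(diff) and max(diff).
            gain = 0.5 * (max(diff) + min(diff))
            return v_new, policy, gain
        v = v_new
    raise RuntimeError("value iteration did not converge")


# Hypothetical toy example: state 0 = good, state 1 = bad.
COSTS = [
    [0.0, 3.0],  # action 0 ("run")
    [2.0, 2.0],  # action 1 ("repair")
]
TRANS = [
    [[0.8, 0.2], [0.0, 1.0]],  # run: the machine may degrade
    [[1.0, 0.0], [1.0, 0.0]],  # repair: back to the good state
]
values, policy, gain = value_iteration(COSTS, TRANS)
```

In this toy instance the resulting decision rule is to run in the good state and repair in the bad state, with an average cost close to 1/3 per period.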

6 Structural Properties

In this section, we will present structural properties of the optimal policy with stochastic maintenance durations. Hence, we consider the model from Section 5. Similar properties will hold with deterministic maintenance durations. The main result will be that maintenance is carried out according to a control-limit policy.

In order to prove the upcoming results, we require Assumption 6.1 and Lemma 6.1:

Assumption 6.1 P1 and P2 have an increasing failure rate, that is, for k ∈ {1, 2}, $\sum_{j=h}^{m+1} P_k[i, j]$ is non-decreasing in i for any fixed h ∈ {1, ..., m, m + 1}.

Lemma 6.1 Let $x_i, x_i' \in \mathbb{R}_+$, i ∈ {1, ..., m}, be such that

\[
\sum_{j=l}^{m} x_j \geq \sum_{j=l}^{m} x_j'
\]

for all l. Let $y_i \in \mathbb{R}$ be a non-decreasing sequence, then

\[
\sum_{j=l}^{m} x_j y_j \geq \sum_{j=l}^{m} x_j' y_j.
\]

Proof. The proof follows from the proof of Lemma 4.7.2 of Puterman (1984) by setting $x_j = x_j' = 0$ for j > m and choosing $y_j$ arbitrarily for j > m.
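A small numerical check can make the statement of Lemma 6.1 concrete. The Python sketch below uses hypothetical sequences x, x′ and y; the values are chosen for illustration only, with y non-negative and the total sums of x and x′ equal, and the helper names are not part of the original text.

```python
def tail_sums(x):
    """Return the tail sums sum_{j >= l} x_j for every starting index l."""
    return [sum(x[l:]) for l in range(len(x))]


def dominates(x, x_prime, tol=1e-12):
    """Check that every tail sum of x is at least the corresponding one of x'."""
    return all(a >= b - tol for a, b in zip(tail_sums(x), tail_sums(x_prime)))


def weighted_sum(x, y):
    return sum(a * b for a, b in zip(x, y))


# Hypothetical example: x puts more mass on high indices than x'.
x = [0.0, 0.2, 0.8]
x_prime = [0.5, 0.3, 0.2]
y = [1.0, 2.0, 5.0]  # non-decreasing sequence

lhs = weighted_sum(x, y)        # about 4.4
rhs = weighted_sum(x_prime, y)  # about 2.1
```

Here `dominates(x, x_prime)` holds, and the weighted sum with respect to x is indeed the larger of the two, as the lemma states.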

We will prove the upcoming lemma using mathematical induction on the steps of the value iteration algorithm. Let $v^t_{a_1}(s)$ denote the cost-to-go in state s ∈ S in phase 1.1 of iteration t of the value iteration algorithm when choosing a1 ∈ A1(s) and the optimal actions thereafter.


of the value iteration algorithm of choosing a2 ∈ A2(s) and the optimal actions thereafter.

Lemma 6.2 gives the intuitive result that the costs-to-go $v^t_N$ and $v^t_J$ are non-decreasing in the deterioration state of machine 1.

Lemma 6.2 For any t ∈ N and any (X2, Y1, Y2, B, I), we have that $v^t_N$ and $v^t_J$ are non-decreasing in X1.

Proof. We implicitly assume all actions are eligible in phases 1.1 and 1.2. If not, the proof is still correct in adjusted form. As mentioned before, we will use mathematical induction.

The goal is to show that $v^t(s)$ is non-decreasing in X1 for all s ∈ S. This will immediately yield the desired result. Note that $v^0(s)$ is non-decreasing in X1 as it is assumed that $v^0(s) = 0$ for all s ∈ S. As part of the mathematical induction, we assume that $v^{t-1}(s)$ is non-decreasing in X1 for all s ∈ S.

We will follow the steps of the value iteration algorithm. As a start, note that

\[
u^t_5(s) = v^{t-1}(s) + C_5(s),
\]

\[
u^t_{4.2}(s) = \sum_{j=0}^{\bar{D}} P[D = j] \, u^t_5(X_1, X_2, Y_1, Y_2, B, \max\{I - j, -w\}) + C_l(s),
\]

and

\[
u^t_{4.1}(s) =
\begin{cases}
u^t_{4.2}(s) & \text{if } Y_2 \neq 1, \\
\sum_{x=X_2}^{m+1} P_2[X_2, x] \, u^t_{4.2}(X_1, x, Y_1, 0, B, I) & \text{if } Y_2 = 1.
\end{cases}
\]

The functions C5(s) and Cl(s) are constant in X1 for all s ∈ S. By the induction assumption, $v^{t-1}(s)$ is non-decreasing in X1 for all s ∈ S. We then have that $u^t_5(s)$ is the sum of non-decreasing functions and thus non-decreasing in X1 itself for all s ∈ S. Consequently, the first term in the formula of $u^t_{4.2}(s)$ is a weighted sum of non-decreasing functions. As a result, $u^t_{4.2}(s)$ is non-decreasing in X1 for all s ∈ S. Again applying the property that a weighted sum of non-decreasing functions is non-decreasing, we find that $u^t_{4.1}(s)$ is non-decreasing in X1 for all s ∈ S.

For the next phase we have that


By Assumption 6.1, Lemma 6.1 and the induction assumption, for $X_1' > X_1$,

\[
\sum_{x=X_1}^{m+1} P_1[X_1, x] \, u^t_{4.1}(x, X_2, 0, Y_2, B, I) \leq \sum_{x=X_1'}^{m+1} P_1[X_1', x] \, u^t_{4.1}(x, X_2, 0, Y_2, B, I).
\]

Hence, $\sum_{x=X_1}^{m+1} P_1[X_1, x] \, u^t_{4.1}(x, X_2, 0, Y_2, B, I)$ is non-decreasing in X1. Therefore, $u^t_{3.2}(s)$ is non-decreasing in X1 for all s ∈ S.

We now move to the phases 2.1 to 3.2 concerned with observing the outcome of processing and maintenance. We have that

\[
u^t_{3.1}(s) =
\begin{cases}
u^t_{3.2}(s) & \text{if } Y_2 \neq 1, \\
q_2 \cdot u^t_{3.2}(X_1, X_2, Y_1, Y_2, B - 1, I + 1) + (1 - q_2) \cdot u^t_{3.2}(s) & \text{if } Y_2 = 1,
\end{cases}
\]

\[
u^t_{2.2}(s) =
\begin{cases}
u^t_{3.1}(s) & \text{if } Y_1 \neq 1, \\
q_1 \cdot u^t_{3.1}(X_1, X_2, Y_1, Y_2, B + 1, I) + (1 - q_1) \cdot u^t_{3.1}(s) & \text{if } Y_1 = 1,
\end{cases}
\]

\[
u^t_{2.1}(s) =
\begin{cases}
u^t_{2.2}(s) & \text{if } Y_2 \geq 0, \\
r_{pm} \cdot u^t_{2.2}(X_1, X_2, Y_1, 0, B, I) + (1 - r_{pm}) \cdot u^t_{2.2}(s) & \text{if } Y_2 = -1, \\
r_{cm} \cdot u^t_{2.2}(X_1, X_2, Y_1, 0, B, I) + (1 - r_{cm}) \cdot u^t_{2.2}(s) & \text{if } Y_2 = -2,
\end{cases}
\]

and

\[
u^t_{1.2}(s) =
\begin{cases}
u^t_{2.1}(s) & \text{if } Y_1 \geq 0, \\
r_{pm} \cdot u^t_{2.1}(X_1, X_2, 0, Y_2, B, I) + (1 - r_{pm}) \cdot u^t_{2.1}(s) & \text{if } Y_1 = -1, \\
r_{cm} \cdot u^t_{2.1}(X_1, X_2, 0, Y_2, B, I) + (1 - r_{cm}) \cdot u^t_{2.1}(s) & \text{if } Y_1 = -2.
\end{cases}
\]

By repeatedly applying the rule that a weighted sum of non-decreasing functions is non-decreasing, we have that $u^t_{3.1}(s)$, $u^t_{2.2}(s)$, $u^t_{2.1}(s)$ and $u^t_{1.2}(s)$ are all non-decreasing in X1 for all s ∈ S.


\[
= \min\left\{ u^t_{1.1}(s),\; u^t_{1.1}(1, X_2, -(1 + \mathbb{1}_{\{X_1 = m + 1\}}), Y_2, B, I),\; u^t_{1.1}(X_1, X_2, 1, Y_2, B, I) \right\}.
\]

As the minimum of non-decreasing functions is non-decreasing, we have that $u^t_{1.1}(s)$ is non-decreasing in X1 for all s ∈ S. Then, $v^t(s)$ is non-decreasing in X1 for all s ∈ S and the desired result is reached.

Lemma 6.3 shows that, when performing preventive maintenance, the cost-to-go $v^t_M(s)$ does not depend on the deterioration state of machine 1.

Lemma 6.3 For any t ∈ N and any (X2, Y1, Y2, B, I), we have that $v^t_M$ is constant in X1 ∈ {1, ..., m}.

Proof. For simplicity, assume that action M is eligible in phase 1.1. We have

\[
v^t_M(s) = u^t_{1.1}(1, X_2, -1, Y_2, B, I) + c_{pm},
\]

for X1 ∈ {1, ..., m}. Clearly, $u^t_{1.1}(1, X_2, -1, Y_2, B, I)$ is constant in X1 for all s ∈ S. Hence, $v^t_M(s)$ is constant in X1 for all s ∈ S with X1 < m + 1.

In Lemma 6.4, we combine the previous results and prove the existence of a control-limit policy.

Lemma 6.4 Whenever machine 1 is idle, for each combination of (X2, Y2, B, I) there exists a threshold $X^J_{(X_2, Y_2, B, I)} \in \{1, \ldots, m\}$ such that for $X_1 > X^J_{(X_2, Y_2, B, I)}$ it is optimal to choose a1 = M over a1 = J. Similarly, there exists a threshold $X^N_{(X_2, Y_2, B, I)} \in \{1, \ldots, m\}$ such that for $X_1 > X^N_{(X_2, Y_2, B, I)}$ it is optimal to choose a1 = M over a1 = N.

Proof. We have that initiating maintenance in the failed state is mandatory. Combining this observation, Lemma 6.2 and Lemma 6.3 yields the result.

Lemma 6.4 has the following consequence: if initiating maintenance is optimal for any state (X1, X2, Y1, Y2, B, I), it is also optimal for state (X1′, X2, Y1, Y2, B, I) with X1′ > X1.
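The control-limit structure can be checked mechanically on a computed decision rule. The Python sketch below is an illustration with hypothetical helper names; the two action lists are the decisions for machine 1 reported in Table 3 for X2 = 1 and (B, I) = (2, 2), ordered by X1 = 1, ..., 4.

```python
def is_control_limit(actions_by_x1):
    """True if, once M is chosen at some X1, it is chosen at every higher X1."""
    seen_m = False
    for action in actions_by_x1:
        if action == "M":
            seen_m = True
        elif seen_m:
            return False
    return True


def control_limit(actions_by_x1):
    """Largest deterioration state (1-based) in which M is not chosen; 0 if none."""
    limit = 0
    for x1, action in enumerate(actions_by_x1, start=1):
        if action != "M":
            limit = x1
    return limit


# Machine 1 decisions from Table 3, column X2 = 1, for X1 = 1, ..., 4.
deterministic = ["N", "N", "M", "M"]
stochastic = ["N", "M", "M", "M"]
```

Both lists have the control-limit form; the limit is 2 with deterministic and 1 with stochastic maintenance durations, in line with the earlier start of preventive maintenance discussed in Section 7.2.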

Similar results turn out to hold for machine 2 in the form of Lemma 6.5, Lemma 6.6 and Lemma 6.7. The proofs are omitted due to the great resemblance with the lemmas corresponding to machine 1.

Lemma 6.5 For any t ∈ N and any (X1, Y1, Y2, B, I), we have that $u^t_{1.1,N}$ and $u^t_{1.1,J}$ are non-decreasing in X2.

Lemma 6.6 For any t ∈ N and any (X1, Y1, Y2, B, I), we have that $u^t_{1.1,M}$ is constant in X2 ∈ {1, ..., m}.

Lemma 6.7 Whenever machine 2 is idle, for each combination of (X1, Y1, B, I) there exists a threshold $X^J_{(X_1, Y_1, B, I)} \in \{1, \ldots, m\}$ such that for $X_2 > X^J_{(X_1, Y_1, B, I)}$ it is optimal to choose a2 = M over a2 = J. Similarly, there exists a threshold $X^N_{(X_1, Y_1, B, I)} \in \{1, \ldots, m\}$ such that for $X_2 > X^N_{(X_1, Y_1, B, I)}$ it is optimal to choose a2 = M over a2 = N.

7 Numerical Analysis

In this section, the MDPs from Sections 4 and 5 are used to determine optimal decisions. The results provide insights into the trade-off between production and maintenance. In Section 7.1 we examine the optimal decisions for a general base case with deterministic maintenance durations. Section 7.2 contains a sensitivity analysis that investigates the influence of having stochastic instead of deterministic maintenance durations. Furthermore, we study the effect of parameter changes in this analysis.

In all of our analyses, we use a discretized stationary gamma process to model the deterioration of the machines. The gamma process is based on the gamma distribution with shape parameter a > 0, scale parameter b > 0 and density function

\[
f_{a,b}(x) = \frac{1}{\Gamma(a) b^a} x^{a-1} e^{-x/b}, \quad x > 0,
\]

where $\Gamma(a) = \int_0^\infty z^{a-1} e^{-z} \, dz$ denotes the gamma function. In order to discretize the gamma process, we use the approach described in De Jonge (2019). We refer to the Appendix for more information on the applied method. Both machines deteriorate according to the same transition probability matrix P. Hence, P = P1 = P2. The parameters of the gamma process are a = 0.1 and b = 0.7. Furthermore, we assume a machine fails if the deterioration level exceeds the fixed threshold L = 1. The length of the time steps is set equal to one, i.e., ∆t = 1. Identifying m = 3 deterioration states before failure results in the following deterioration matrix:

\[
P \approx
\begin{pmatrix}
0.867 & 0.091 & 0.022 & 0.018 \\
0 & 0.867 & 0.091 & 0.041 \\
0 & 0 & 0.867 & 0.131 \\
0 & 0 & 0 & 1
\end{pmatrix}.
\]
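As a sanity check, one can verify numerically that this matrix is consistent with the increasing failure rate property of Assumption 6.1. The Python sketch below is an illustration only: the function name is hypothetical, and because the reported entries are rounded to three decimals, each row is first renormalized to sum to one, which is an assumption made here and not part of the original method.

```python
P = [
    [0.867, 0.091, 0.022, 0.018],
    [0.0, 0.867, 0.091, 0.041],
    [0.0, 0.0, 0.867, 0.131],
    [0.0, 0.0, 0.0, 1.0],
]


def has_increasing_failure_rate(matrix, tol=1e-9):
    """Check that sum_{j >= h} P[i][j] is non-decreasing in i for every h."""
    n = len(matrix)
    # Renormalize rows to remove the effect of rounding (assumption).
    rows = [[p / sum(row) for p in row] for row in matrix]
    for h in range(n):
        tails = [sum(row[h:]) for row in rows]
        if any(tails[i] > tails[i + 1] + tol for i in range(n - 1)):
            return False
    return True


ifr_ok = has_increasing_failure_rate(P)
```

Without the renormalization the check would fail for a trivial reason: the rounded rows 2 and 3 sum to 0.999 and 0.998, so their tail sums differ by 0.001 in the wrong direction.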

The cost of preventive maintenance equals cpm = 4 and corrective maintenance costs ccm = 5. Each time period, the holding cost per item is hI = 1 for the inventory and hB = 1 for the


When considering deterministic maintenance durations, carrying out preventive maintenance takes Tpm = 2 time periods and corrective maintenance Tcm = 3 time periods. With stochastic maintenance durations, the probability that preventive maintenance is finished in the current time period is rpm = 1/2. The probability that corrective maintenance is finished is rcm = 1/3. Note that the expected time until maintenance is finished after initiation does not differ between deterministic maintenance durations and stochastic maintenance durations. For all instances, we use the value iteration algorithm with ε = 10^{-3}.
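The equality of the expected durations can be made explicit with a short calculation. It assumes that the number of time periods T occupied by a maintenance action with per-period completion probability r is geometrically distributed on {1, 2, ...}, which is how the stochastic model of Section 5 is read here.

```latex
\[
P(T = k) = (1 - r)^{k-1} r, \qquad
E[T] = \sum_{k=1}^{\infty} k (1 - r)^{k-1} r = \frac{1}{r},
\]
\[
\frac{1}{r_{pm}} = \frac{1}{1/2} = 2 = T_{pm}, \qquad
\frac{1}{r_{cm}} = \frac{1}{1/3} = 3 = T_{cm}.
\]
```

Under the same assumption the variance of T is (1 − r)/r², that is, 2 for preventive and 6 for corrective maintenance, whereas it is zero in the deterministic case; the two settings therefore differ only in the uncertainty of the duration, not in its mean.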

7.1 Interpreting the optimal decisions for a base case

In this section, we consider deterministic maintenance durations. The buffer and inventory capacity are Bmax = 3 and Imax = 4, respectively. The maximum number of backorders w = 8 is set relatively high to make its effect negligible. Hence, we examine the situation in which lost orders occur rarely. The processing success probabilities for the two machines are q1 = 0.85 and q2 = 0.9. Furthermore, we have the following demand distribution: P[D = 0] = 0.7, P[D = 1] = 0.2, and P[D = 2] = 0.1.

Table 1 presents optimal decisions for a selection of states in which both machines are idle (Y1 = Y2 = 0). First, we consider the optimal decisions for both machines when machine 2 is in a low deterioration state, i.e., X2 = 1 or X2 = 2. Starting with the optimal decisions for machine 1, we observe that when machine 1 is not in the failed state (X1 ≤ 3), it is in most situations optimal to process on machine 1. The exception is when the sum of items in the buffer and the inventory is relatively large. Then, there is no urgent need for additional items to reach the buffer and the choice is between leaving the machine idle and initiating preventive maintenance. It appears to be optimal to leave the machine idle when machine 1 is in the first or second deterioration state (X1 = 1 or X1 = 2) as the machine is in a good enough condition to process more items. When in deterioration state 3 (X1 = 3), the fact that no additional items need to reach the buffer is used as an opportunity to perform preventive maintenance on machine 1. When machine 1 has failed (X1 = 4), it has to be maintained.


buffer beforehand. At a lower deterioration level, the additional item in the buffer is not worth putting the valuable position of being in a low deterioration state at risk.

The optimal strategy for machine 2 when it is at the first or second deterioration level (X2 = 1 or X2 = 2) is to process an item when the buffer is non-empty. If the buffer is empty, we will leave the machine idle except when machine 1 is in the failed state and the inventory position is below 2. As maintenance is initiated on machine 1 and the buffer is empty, no processing can take place on machine 2 in the upcoming periods. This is used as an opportunity to perform preventive maintenance on machine 2 when it is at deterioration level 2. When the inventory position is 2 or larger, there are enough items in the inventory to satisfy the demand in the upcoming periods and preventive maintenance on machine 2 is postponed.

Table 1: Optimal decisions assuming both machines are idle for different combinations of (X1, X2, B, I).

                     X2 = 1                            X2 = 2
X1   B \ I    -1      0       1       2         -1      0       1       2
1    0        (J,N)   (J,N)   (J,N)   (J,N)     (J,N)   (J,N)   (J,N)   (J,N)
1    1        (J,J)   (J,J)   (J,J)   (N,J)     (J,J)   (J,J)   (J,J)   (N,J)
1    2        (J,J)   (N,J)   (N,J)   (N,J)     (J,J)   (N,J)   (N,J)   (N,J)
2    0        (J,N)   (J,N)   (J,N)   (J,N)     (J,N)   (J,N)   (J,N)   (J,N)
2    1        (J,J)   (J,J)   (J,J)   (N,J)     (J,J)   (J,J)   (J,J)   (N,J)
2    2        (J,J)   (N,J)   (N,J)   (N,J)     (J,J)   (N,J)   (N,J)   (N,J)
3    0        (J,N)   (J,N)   (J,N)   (J,N)     (J,N)   (J,N)   (J,N)   (J,N)
3    1        (J,J)   (J,J)   (J,J)   (M,J)     (J,J)   (J,J)   (J,J)   (M,J)
3    2        (J,J)   (J,J)   (M,J)   (M,J)     (J,J)   (J,J)   (M,J)   (M,J)
4    0        (M,N)   (M,N)   (M,N)   (M,N)     (M,M)   (M,M)   (M,M)   (M,N)
4    1        (M,J)   (M,J)   (M,J)   (M,J)     (M,J)   (M,J)   (M,J)   (M,J)
4    2        (M,J)   (M,J)   (M,J)   (M,J)     (M,J)   (M,J)   (M,J)   (M,J)

                     X2 = 3                            X2 = 4
X1   B \ I    -1      0       1       2         -1      0       1       2
1    0        (J,M)   (J,M)   (J,M)   (N,M)     (J,M)   (J,M)   (N,M)   (N,M)
1    1        (J,J)   (J,J)   (J,J)   (N,J)     (N,M)   (N,M)   (N,M)   (N,M)
1    2        (J,J)   (N,J)   (N,J)   (N,J)     (N,M)   (N,M)   (N,M)   (N,M)
2    0        (J,N)   (J,M)   (J,M)   (J,M)     (J,M)   (J,M)   (J,M)   (N,M)
2    1        (J,J)   (J,J)   (J,J)   (N,J)     (J,M)   (M,M)   (M,M)   (N,M)
2    2        (J,J)   (N,J)   (N,J)   (N,J)     (M,M)   (M,M)   (M,M)   (N,M)
3    0        (J,N)   (J,N)   (J,N)   (J,N)     (J,M)   (J,M)   (M,M)   (M,M)
3    1        (J,J)   (J,J)   (J,J)   (M,J)     (M,M)   (M,M)   (M,M)   (M,M)
3    2        (J,J)   (J,J)   (M,J)   (M,J)     (M,M)   (M,M)   (M,M)   (M,M)
4    0        (M,M)   (M,M)   (M,M)   (M,M)     (M,M)   (M,M)   (M,M)   (M,M)
4    1        (M,J)   (M,J)   (M,J)   (M,J)     (M,M)   (M,M)   (M,M)   (M,M)
4    2        (M,J)   (M,J)   (M,J)   (M,J)     (M,M)   (M,M)   (M,M)   (M,M)

Next, consider machine 2 being in deterioration state 3 (X2 = 3). Concerning machine


idle when at or below the second deterioration level and maintenance is initiated on machine 1 when at deterioration level 3. When machine 1 is in the failed state, the only eligible action is to initiate maintenance.

Also when machine 2 is at deterioration state 3 (X2 = 3), it is optimal to process an item on machine 2 when there is a positive number of items in the buffer. If the buffer is empty, the choice is between leaving machine 2 idle and initiating preventive maintenance on machine 2. When machine 1 is in the failed state, we use this time of no items reaching the buffer to initiate preventive maintenance on machine 2. When machine 1 is not in the failed state, we observe that the optimal decision depends on the deterioration state of machine 1 and the inventory position. It is optimal to initiate maintenance instead of leaving the machine idle when the inventory position is above a certain threshold. The inventory position is important because as the amount of inventory increases, there is less risk of running out of inventory during maintenance. Hence, the relative value of initiating maintenance instead of leaving the machine idle increases in the inventory position. Furthermore, the threshold increases in the deterioration state of machine 1, i.e., X1. This has the following explanation. In the optimum scenario, maintenance on both machines is performed simultaneously. When machine 1 is processing and in a higher deterioration state, it is likely that it will reach the failed state in the upcoming period and will need to be maintained. Depending on the inventory position, it may then be optimal to apply a wait-and-see tactic for machine 2 and thus be more reluctant in initiating maintenance. When machine 1 has a lower deterioration level, the probability of an upcoming failure for this machine is too low to justify waiting. Then, preventive maintenance on machine 2 will be initiated immediately.

Lastly, when machine 2 is in the failed state (X2 = 4), starting corrective maintenance on machine 2 is compulsory. The fact that machine 2 will be unavailable in the upcoming periods creates an incentive to also initiate maintenance on machine 1. If machine 1 is in the lowest deterioration state (X1 = 1), it is optimal to process an item when the buffer is empty and the


When at deterioration level 3, chances are high the machine will fail during processing and the unavailability of machine 2 is used as an opportunity to perform maintenance on machine 1 in almost all scenarios.

Concluding, we observe that the optimal decision for a machine not only depends on its own deterioration state but also on the deterioration state of the other machine. Furthermore, the buffer and inventory position also have a significant impact. The willingness to process an item instead of leaving the machine idle may be higher when the machine is in a higher deterioration state due to the fact that maintenance then is soon due. Additionally, when considering initiating preventive maintenance, it is sometimes optimal to deploy a wait-and-see tactic to observe the outcomes of the stochastic processes. Lastly, we notice that it is never optimal to initiate preventive maintenance on machine 2 when there is a positive number of items in the buffer. This is caused by the fact that holding items in the buffer and the inventory costs the same (hB = hI) and, due to the uncertainty introduced by the processing success probabilities, it is more valuable to have an additional item in the inventory than in the buffer.

7.2 Sensitivity Analysis

In this section, we observe how the optimal solution changes in case of stochastic maintenance durations instead of deterministic maintenance durations. Furthermore, we will investigate the effects of changes in the parameter values.

In our first comparison, we investigate what happens when we shift from deterministic to stochastic maintenance durations. The same parameter values as in Section 7.1 are used. Due to an increase in uncertainty, the minimum average cost per time period increases from 3.570 to 3.939 (an increase of 10.3%).

In the optimal decisions the change to stochastic maintenance durations becomes most apparent in the planning of preventive maintenance. When we shift from deterministic to stochastic maintenance durations, it becomes uncertain how long maintenance will last. As a result, it is optimal to take a more risk-averse stance towards initiating preventive maintenance. In Table 2 (a) and (b) we observe that when the buffer contains one item (B = 1) and the inventory is empty (I = 0), with stochastic maintenance durations, there are fewer situations in which it is optimal to start preventive maintenance on machine 1. Furthermore, in Table 3 (a) and (b), we observe that when the buffer contains two items (B = 2) and the inventory contains two items (I = 2), with stochastic maintenance durations, it is optimal to initiate preventive maintenance on machine 1 when X1 = 2, while with deterministic maintenance durations it was optimal to leave machine


and inventory with stochastic maintenance durations. With stochastic instead of deterministic maintenance durations, it is optimal to be more eager to perform preventive maintenance when the buffer and inventory are relatively full. On the other hand, when the number of items in the buffer and inventory is low, it is better to be more reluctant in initiating preventive maintenance when having stochastic maintenance durations as it is uncertain how long the machine will be out of service. When examining the preventive maintenance planning of machine 2, we observe a similar pattern.

Table 2: Planning for (B, I) = (1, 0)

(a) Deterministic maintenance durations

X1 \ X2   1       2       3       4
1         (J,J)   (J,J)   (J,J)   (N,M)
2         (J,J)   (J,J)   (J,J)   (M,M)
3         (J,J)   (J,J)   (J,J)   (M,M)
4         (M,J)   (M,J)   (M,J)   (M,M)

(b) Stochastic maintenance durations

X1 \ X2   1       2       3       4
1         (J,J)   (J,J)   (J,J)   (N,M)
2         (J,J)   (J,J)   (J,J)   (J,M)
3         (J,J)   (J,J)   (J,J)   (J,M)
4         (M,J)   (M,J)   (M,J)   (M,M)

Table 3: Planning for (B, I) = (2, 2)

(a) Deterministic maintenance durations

X1 \ X2   1       2       3       4
1         (N,J)   (N,J)   (N,J)   (N,M)
2         (N,J)   (N,J)   (N,J)   (N,M)
3         (M,J)   (M,J)   (M,J)   (M,M)
4         (M,J)   (M,J)   (M,J)   (M,M)

(b) Stochastic maintenance durations

X1 \ X2   1       2       3       4
1         (N,J)   (N,J)   (N,J)   (N,M)
2         (M,J)   (M,J)   (M,J)   (M,M)
3         (M,J)   (M,J)   (M,J)   (M,M)
4         (M,J)   (M,J)   (M,J)   (M,M)

We now investigate the effect on the average cost when the processing success probabilities q1 and q2 change. In other words, we assess the relationship between average cost and the processing speed of the machines. We use the parameter values from Section 7.1 as our basic set-up. In Figure 1 (a) and (b) the final results are presented. In all investigated scenarios, stochastic maintenance durations result in a cost increase of approximately 10% compared to deterministic maintenance durations. The average cost is decreasing when q1 or q2 increases, which seems natural since having a faster machine will result in a lower average cost. As the throughput of the production line depends on the processing speed of both machines and the processing speed of the other machine is kept constant, the marginal gain for an increase in q1 (q2) decreases in q1 (q2). Furthermore, we observe that the marginal gain is bigger for q2 than for q1. When an item is unsuccessfully processed on machine 2, the item is placed back into


probability of machine 2.


Figure 1: Average cost as a function of q1 (a) and q2 (b).

Next, the goal is to quantify the benefit of having additional buffer and inventory space. We use the parameter values from Section 7.1 and vary Bmax and Imax between one and five. The results of altering Bmax are given in Table 4. The outcomes when changing Imax can be found in Table 5. The cost advantage of having maintenance durations that are deterministic instead of stochastic is approximately 10% in almost all scenarios. This confirms what has been found in previous experiments. The exception is when the buffer capacity is equal to one, in which case the difference is 6%.

From the resulting outcomes we can conclude that a relatively low amount of buffer and inventory space will have a large impact on the average cost. In the scenario where Bmax = 1, the costs can be decreased by approximately 74% by increasing the buffer capacity by one. When Imax = 1, increasing the inventory capacity by one yields a cost decrease of almost 16%. Hence, a very low buffer capacity is more costly than a very low inventory capacity. When the buffer is full, no items can be processed on machine 1 until machine 2 has successfully processed an item. Building up a large inventory then becomes very challenging. Having a full inventory is less costly as then items can wait in the buffer and only need to be processed by one more machine. Furthermore, one can conclude that the marginal gain of having more space becomes very small after a certain threshold. In our model there is little benefit in setting Bmax and Imax larger than three.


Table 4: The effect of changing the buffer capacity on the average cost with Imax = 4.

Bmax   Average cost, deterministic   Average cost, stochastic
       maintenance durations         maintenance durations
1      14.42                         15.31
2      3.65                          4.04
3      3.57                          3.94
4      3.56                          3.93
5      3.56                          3.93

Table 5: The effect of changing the inventory capacity on the average cost with Bmax = 3.

Imax   Average cost, deterministic   Average cost, stochastic
       maintenance durations         maintenance durations
1      4.36                          4.85
2      3.67                          4.08
3      3.57                          3.94
4      3.57                          3.94
5      3.57                          3.94

deterministic and stochastic maintenance durations. As the variance of the demand distribution increases, the uncertainty in the model increases, and we observe an increase in the average cost. Furthermore, as can be seen in the fifth column of Table 7, the effect of having stochastic instead of deterministic maintenance durations is lower when demand is very volatile. This also works the other way around: when we are dealing with stochastic maintenance durations and therefore with some additional uncertainty, the effect of demand volatility is lower. For example, when dealing with deterministic maintenance durations the average cost for demand distribution 5 is 2.33 times as large as the average cost using demand distribution 1. When there are stochastic maintenance durations, the average cost for distribution 5 is only 2.09 times as large. Concluding, the marginal effect of uncertainty in the model on the average cost diminishes.

Table 6: The different demand distributions and their corresponding mean and variance

Demand distribution   P(D = 0)   P(D = 1)   P(D = 2)   P(D = 3)   E[D]   Var(D)
1                     0.6        0.4        0          0          0.4    0.24
2                     0.7        0.2        0.1        0          0.4    0.44
3                     0.75       0.15       0.05       0.05       0.4    0.64
4                     0.8        0.1        0          0.1        0.4    0.84


Table 7: The average cost for five different demand distributions.

Demand         Var(D)   Average cost, deterministic   Average cost, stochastic   Increase from deterministic
distribution            maintenance durations         maintenance durations      to stochastic
1              0.24     2.53                          2.96                       17.1%
2              0.44     3.57                          3.94                       10.3%
3              0.64     4.55                          4.85                       6.8%
4              0.84     5.25                          5.56                       5.8%
5              1.04     5.90                          6.19                       4.9%
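The summary statistics in Table 6 and the cost ratios quoted in the text can be reproduced with a few lines of Python. This sketch is only a check of the reported numbers: the helper name is hypothetical, the probability mass functions are those of demand distributions 1 to 4 as listed in Table 6, and the costs are the rounded values from Table 7.

```python
def mean_and_variance(pmf):
    """Mean and variance of a discrete distribution given as {value: probability}."""
    mean = sum(d * p for d, p in pmf.items())
    second_moment = sum(d * d * p for d, p in pmf.items())
    return mean, second_moment - mean ** 2


# Demand distributions 1 to 4 from Table 6.
DISTRIBUTIONS = {
    1: {0: 0.6, 1: 0.4},
    2: {0: 0.7, 1: 0.2, 2: 0.1},
    3: {0: 0.75, 1: 0.15, 2: 0.05, 3: 0.05},
    4: {0: 0.8, 1: 0.1, 3: 0.1},
}

# Rounded average costs (deterministic, stochastic) from Table 7.
COSTS = {1: (2.53, 2.96), 5: (5.90, 6.19)}

stats = {k: mean_and_variance(pmf) for k, pmf in DISTRIBUTIONS.items()}
ratio_deterministic = COSTS[5][0] / COSTS[1][0]
ratio_stochastic = COSTS[5][1] / COSTS[1][1]
```

All four distributions have mean 0.4, and the two ratios round to 2.33 and 2.09, matching the figures discussed in the sensitivity analysis.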

8 Discussion & Conclusion

We have studied optimal joint production and maintenance strategies for two-machine production lines. Both machines deteriorate when used to process items and thus occasionally require maintenance. All items need to be processed sequentially by both machines. A buffer with limited capacity is present between the machines. At the end of the production line, a limited number of items can be stored. The production facility faces random demand.

Two different models are presented, both formulated as a Markov decision process. In the first model, deterministic maintenance durations are considered. In the second model, we have assumed stochastic maintenance durations. In order to clarify the transitions between the states over time, each time period has been split into different phases. In the first two phases, the actions to be performed in the current time period for both machines are chosen. In the remaining phases, the outcomes of the stochastic processes are examined. This includes observing if processing on the machines was successful, possible deterioration and the realized demand. In addition, in the second model, there is an opportunity for maintenance to finish.

We continued by presenting structural properties of the model with stochastic maintenance durations. For both machines, we formulated and presented multiple lemmas. These are combined in order to prove that maintenance is carried out according to a control-limit policy. This implies that when in a state where initiating maintenance on machine 1 (2) is optimal, it is also optimal to initiate maintenance if the deterioration state of machine 1 (2) is higher, ceteris paribus.


when planning preventive maintenance. If machine 1 then fails during the current processing, it is optimal to perform maintenance on the machines simultaneously. Moreover, we found that it is optimal to keep the buffer as empty as possible when assuming the holding costs for the buffer and inventory are equal.

Lastly, we performed a sensitivity analysis to examine the influence of parameter changes. Besides higher average cost, the presence of stochastic maintenance durations leads to a more risk-averse stance concerning preventive maintenance. When a low number of items is in the buffer and inventory, it is optimal to be more reluctant to initiate maintenance because of the uncertain duration. When investigating the effect of the processing speed of the machines, we observed that increasing the speed of machine 2 yields a larger marginal benefit than enhancing the speed of machine 1. The advantage of having additional capacity for the buffer and inventory is limited when the capacities are already sufficiently large. Furthermore, we found that the expected average cost increases significantly with demand volatility. The effect on the expected average cost of having volatile demand is smaller when the maintenance durations are uncertain and vice versa.

There exist many research opportunities related to this paper. Instead of two machines in series, a complex system of machines could be considered. Multiple machines could be placed in series and in parallel. This will make determining exact optimal policies more complicated because of the large corresponding state space. The results from this paper can be used for constructing appropriate heuristics. It would also be possible to introduce economic, structural or stochastic dependence.

Furthermore, in this research we assumed that if processing on the second machine fails, the item is placed back into the buffer. This item may again be processed on the second machine in the future. It would be interesting to examine the influence on the optimal decisions when, after a processing failure on the second machine, the item is not placed back into the buffer but is discarded. A related research opportunity would be to assume that the buffer and the inventory share their capacity.

References

Assid, M., Gharbi, A., and Hajji, A. (2015). Joint production, setup and preventive maintenance policies of unreliable two-product manufacturing systems. International Journal of Production Research, 53(15):4668–4683.

Batun, S. and Maillart, S. (2012). Reassessing Tradeoffs Inherent to Simultaneous Maintenance and Production Planning. Production and Operations Management, 21(2):396–403.

Borrero, J. S. and Akhavan-Tabatabaei, R. (2015). Time and inventory dependent optimal maintenance policies for single machine workstations: An MDP approach. European Journal of Operational Research, 228(3):545–555.

Bouslah, B., Gharbi, A., and Pellerin, R. (2018). Joint production, quality and maintenance control of a two-machine line subject to operation-dependent and quality-dependent failures. International Journal of Production Economics, 195:210–226.

Buzek, G. (2015). We Lost Australia. Technical report, IHL Group.

Cassady, C. R. and Kutanoglu, E. (2005). Integrating Preventive Maintenance Planning and Production Scheduling for a Single Machine. IEEE Transactions on Reliability, 54(2):304–309.

De Jonge, B. (2019). Discretizing continuous-time continuous-state deterioration processes, with an application to condition-based maintenance optimization. Reliability Engineering and System Safety, 188:1–5.

De Jonge, B. (2020). Joint Condition-Based Production and Maintenance Optimization for a Deteriorating Production System. unpublished.

De Jonge, B. and Scarf, P. (2020). A review on maintenance optimization. European Journal of Operational Research, 285(3):805–824.

Gharbi, A. and Kenné, J. (2005). Maintenance scheduling and production control of multiple-machine manufacturing systems. Computers & Industrial Engineering, 48(4):693–707.

Hafidi, N., El Barkany, A., El Mhamedi, A., and Mahmoudi, M. (2020). Joint optimization of production and maintenance for multi-machines subject to degradation and subcontracting constraints. Journal of Quality in Maintenance Engineering, ahead-of-print.

Hopp, W. and Spearman, M. (2008). Factory Physics: Foundations of Manufacturing Management. McGraw-Hill, New York, NY, USA, 3rd edition.

Iravani, S. and Duenyas, I. (2002). Integrated maintenance and production control of a deteriorating production system. IIE Transactions, 34(5):423–435.

Jafari, L. and Makis, V. (2015). Joint optimal lot sizing and preventive maintenance policy for a production facility subject to condition monitoring. International Journal of Production Economics, 169:156–168.

Kazaz, B. and Sloan, T. (2008). Production policies under deteriorating process conditions. IIE Transactions, 40(3):187–205.

Kazaz, B. and Sloan, T. (2013). The impact of process deterioration on production and maintenance policies. European Journal of Operational Research, 227(1):88–100.

Lu, B. and Zhou, X. (2017). Opportunistic preventive maintenance scheduling for serial-parallel multistage manufacturing systems with multiple streams of deterioration. Reliability Engineering and System Safety, 168:116–127.

Mifdal, T. W., Hajej, Z., Dellagi, S., and Rezg, N. (2013). An optimal production planning and maintenance policy for a multiple-product and single machine under failure rate dependency. IFAC Proceedings Volumes, 46(9):507–512.

Najid, N. M., Alaoui, M. S., and Mouhafid, A. (2011). An Integrated Production and Maintenance Planning Model with time windows and shortage cost. International Journal of Production Research, 49(8):2265–2283.

Puterman, M. L. (1984). Markov Decision Processes. Wiley, New York, NY, USA.

Sheu, S., Chang, C., Chen, Y., and Zhang, Z. H. (2015). Optimal preventive maintenance and repair policies for multi-state systems. Reliability Engineering and System Safety, 140:78–87.

Sloan, T. and Shanthikumar, J. (2000). Combined production and maintenance scheduling for a multiple-product, single-machine production system. Production and Operations Management, 9(4):379–399.

Xiao, L., Song, S., Chen, X., and Coit, D. (2016). Joint optimization of production scheduling and machine group preventive maintenance. Reliability Engineering and System Safety, 146:68–78.

Yulan, J., Zuhua, J., and Wenrui, H. (2008). Multi-objective integrated optimization research on preventive maintenance planning and production scheduling for a single machine. International Journal of Advanced Manufacturing Technology, 39(9-10):954–964.

Zhao, S. and Wang, L. (2014). Integrating production planning and maintenance: an iterative method. Industrial Management & Data Systems, 114(2):162–182.