The two-machine one-buffer continuous time model with restart policy

This paper deals with the performance evaluation of production lines in which well defined machine start/stop control policies are applied. A modeling approach has been developed in order to reduce the complexity of a two-machine one-buffer line where a specific control policy, called “restart policy”, is adopted. The restart policy exercises control over the start/stop condition of the first machine: when the buffer gets full and, as a consequence, the first machine is forced to stop production (i.e., it is blocked), the control policy keeps the first machine in an idle state until the buffer becomes empty again. The rationale behind this policy is to reduce the blocking frequency of the first machine, i.e. the probability that a blockage occurs on the first machine due to the buffer filling up. Such a control policy is adopted in practice when outage costs (e.g., waste production) are related to each restart of the machine. The two-machine one-buffer line with restart policy (RP line) is here modeled as a continuous time Markov process so as to consider machines having different capacities and working in an asynchronous manner. The mathematical RP model is described along with its analytical solution. Then, the most critical line performance measures are derived and, finally, some numerical examples are reported to show the effects of such a policy on the blocking frequency of the first machine.


Introduction
This paper presents a Markov model for inhomogeneous two-machine one-buffer production lines where a control policy is applied to reduce the blocking frequency of the first machine.
Production lines represent a particular configuration of manufacturing systems in which machines are connected to one another so as to form a series. In this work, machines are assumed to have deterministic processing rates. This configuration is generally adopted in high-capacity plants with continuous or large-batch production. Frequently, production lines are also automated so as to achieve higher production rates and repetitiveness of task processes.
The performance of a production line strictly depends on the nominal capacity and the reliability parameters of the machines involved. Please note that the term "capacity" related to a machine refers here to its "maximum production rate". Specifically, this work considers machines that can operate at different nominal capacities and whose time-to-failure and time-to-repair distributions are exponential. As regards the latter assumption, practical evidence shows that failures can be modeled as memory-less processes in several real world applications. On the other hand, the assumption of exponentially distributed repair times seems less realistic (as reported in Sect. 1.1 several studies have been carried out about general repair times). However, there are studies (see, e.g., Perrica et al. 2008, about automated production lines) showing that the exponential distribution can be a good approximation for the repair times also. This is because service times for repairs may depend on a number of different activities characterized by high variability (e.g., diagnostic activities).
Moreover, interactions between machines also play an important role in determining the overall line performance. This is because the effect of a failure occurring on a specific machine may propagate to other machines located both upstream and downstream in the line. We refer to a "starvation" when an operational machine is idle because its incoming flow has been interrupted by the failure of an upstream machine; we refer to a "blockage" when its outgoing flow is prevented owing to a downstream failure. In order to reduce efficiency losses due to machine interactions, buffers can be positioned along the line so as to mitigate the effects of starvation and blocking events by accumulating and releasing material as needed.
Another strategy that can be employed to improve the line efficiency is to adopt "control policies", i.e. to exercise some control over the behavior of the machines involved when certain conditions occur. Specifically, this work focuses on a particular control policy typical of processes that incur significant extra costs if interrupted for any reason. Such a control policy is called here "restart policy" and acts as follows: when a machine is blocked because a downstream machine has failed and the intermediate buffer is full (so that no decoupling effect takes place), it is not allowed to leave the blocking state as soon as the buffer level starts to decrease. Hence, this restart policy forces a machine, once it has become blocked, to remain idle until the downstream buffer becomes empty again. The aim is to reduce the probability of subsequent blocking events that can occur if the machine resumes production when the buffer level is just below its maximum size, i.e., when there is little storage space downstream.
The motivation to introduce this kind of control policy arises from a real world case study of automated packaging systems. In a typical line configuration, package formation and filling are executed by the first machine, called "filling machine", by means of a continuous and aseptic process where the packaging material has to pass through a heated hydrogen peroxide bath. In such a production system, if the process executed by the filling machine is interrupted for any reason (e.g., a failure or a blocking event), a portion of the packaging material remains in contact with the hydrogen peroxide for too long. As a consequence, the first packages produced when the stoppage is removed do not comply with the quality requirements. Thus, in this case, as in other applications with significant outage costs, the need arises to reduce to a bare minimum the stoppages of the first machine of the line. On the one hand, internal stoppages can only be reduced by improving the failure rate of the machine itself. On the other hand, stoppages due to blocking events can be reduced by introducing an intermediate buffer and the restart policy studied in this paper. This restart policy forces the first machine to wait for the buffer to become empty before resuming production after any blocking event, so as to ensure that the maximum storage space is available downstream and, consequently, to mitigate the risk of a subsequent interruption of the process. In such packaging lines, those packages that do not meet the quality standards are rejected as waste. Nevertheless, the computation of the amount of waste production is not taken into consideration in this work (please refer to Gebennini and Gershwin 2013, for analytical models including waste production).
In this paper we are interested in a more general performance measure, called "blocking frequency", that describes the rate at which a blocking event occurs (i.e., the rate at which the buffer becomes full). The blocking frequency can be used to evaluate the performance of a production line not only when physical waste is produced but, more generally, when additional costs are incurred as a consequence of a stoppage on the first machine. Total extra costs for the process interruption are proportional to the blocking frequency computed in this paper.
Some interesting considerations about the effect of the restart policy on the blocking frequency are pointed out in Sect. 6 where production lines with and without restart policy are compared.
Hence, the production line under analysis turns out to be a complex system whose overall performance depends not only on the machines' capacities and reliability parameters, but also on the buffer allocation along the line and on the restart policy applied to the first machine.
Analytical modeling of production lines has been extensively investigated in the literature as discussed in Sect. 1.1. Nevertheless, to the author's knowledge, none of the previous analytical models is able to describe an inhomogeneous production line with a restart policy.

Literature review
Since there are several factors affecting the whole line performance (i.e. machine reliability parameters, machine capacities and buffer allocation along the line), the design of a production line is a complex task. As a consequence, there is an increasing need to develop efficient methodologies and approaches to quickly compute line efficiency and productivity with respect to all the key factors. Specifically, this work focuses on mathematical methods, even though there are other methodologies such as simulation. The great advantage of mathematical modeling is that it can lead to a deeper understanding of the system and provide results in very short periods of time (e.g., the model proposed in this paper can be solved by a computer in a few seconds). On the other hand, simulations may be more suitable to study large and complex systems but are often awkward and time-consuming.
Unfortunately, exact mathematical models are available only for short lines, i.e. lines consisting of two machines decoupled by one finite buffer (2M-1B line). However, the main techniques addressing long line performance estimation make use of models developed for the 2M-1B sub-system. This emphasizes the importance of investigating such a simple but not trivial system. Thus, analytical modeling makes it possible to quickly obtain accurate estimations of the line performance for 2M-1B lines. For extensive reviews, the reader can refer to Dallery and Gershwin (1992) and to textbooks such as Papadopoulos et al. (1993) and Gershwin (2002). Analytical models of production lines can be adopted to evaluate the performance with respect to a predefined line configuration, as well as integrated in optimization techniques to derive the optimal buffer allocation (Papadopoulos and Vidalis 1998, 1999, 2001a, 2001b; Bulgak et al. 1995; Gershwin and Schor 2000; Spinellis and Papadopoulos 2000; Lutz et al. 1998).
As introduced above, exact analytical solutions are not available for lines longer than the 2M-1B system. Nevertheless, approximate methodologies have been developed. These methodologies can be classified into two main groups: "decomposition techniques", whose main idea is to decompose the line into a series of 2M-1B sub-systems (important works are Gershwin 1987; Dallery et al. 1989; Choong and Gershwin 1989; Dallery and Frein 1993; Burman 1995; Yeralan and Tan 1997a; Tan and Yeralan 1997; Gershwin and Burman 2000; Levantesi et al. 2003; Maggio et al. 2003; Gershwin and Werner 2007), and "aggregation techniques", where the 2M-1B sub-system is replaced by one single equivalent machine (see Koster 1987; Chiang et al. 2000, 2001; Enginarlar and Meerkov 2005; Chiang et al. 2008). Finally, there exist other approaches, such as the Expansion Method by MacGregor Smith and Daskalaki (1988).
Since production lines are generally characterized by inhomogeneous asynchronous machines, system models must be able to take into consideration machines having different nominal capacities working on an unsynchronized material flow. This can be obtained either by means of appropriate homogenization techniques (see Dallery and Bihan 1997) or by modeling the system as a continuous time Markov process. In this work we adopt the latter modeling approach. It was originally proposed by Zimmern (1956) and extended by Buzacott (1967a, 1967b) and by Wijngaard (1979). A thorough discussion of the justification of fluid models can be found in Mitra (1988). Gershwin and Schick (1980) developed a continuous model for a 2M-1B production line with deterministic processing rates. This line is called here the "Basic line" and the corresponding model the "Basic model".
Since then, extensions of the Basic model have been carried out to represent more complex behaviors of the line. These studies significantly contribute to the literature since the more appropriate the model for the 2M-1B sub-system is, the more representative the approach for the analysis of the whole line will be. As an example, Yeralan and Tan (1997b) proposed a station model for continuous materials flow production systems. Ozdogru and Altiok (2003) analyzed a system with exponential failure time and two-stage Coxian repair time distribution. Recent studies have been carried out to generalize continuous time Markov models by including machines with multiple up and down states (see, e.g., Gershwin 2009, 2011; Tolio 2011; Tolio et al. 2002; Levantesi et al. 2003) and by investigating the relationship between productivity and quality (see, e.g., Kim and Gershwin 2005).
In this paper an interesting extension of the Basic model is discussed. Specifically, the introduction of a "restart policy" is investigated. Such a control policy consists in maintaining the first machine in a "controlled idle state" each time it gets blocked, as a consequence of the buffer filling up, until the buffer empties again. The objective is to reduce a key performance measure called here blocking frequency that, as introduced above, describes the rate at which the buffer becomes full.
The first attempt to model this kind of restart policy was carried out by Gebennini et al. (2009, 2011). Nevertheless, the model developed by the authors is a discrete time Markov model suitable for transfer lines only, that is lines in which machines have the same nominal capacity and the material flow is synchronized. Hence, by introducing this restart policy into the Basic model, the present study extends the work of Gebennini et al. (2009, 2011) to the case of inhomogeneous production lines, so as to allow the consideration of machines having different capacities. This line with restart policy is called here the "RP line" and the new corresponding model the "RP model". Note that the introduction of this control policy significantly increases the complexity of the mathematics involved. This emphasizes the need for methodologies to facilitate mathematical modeling. The literature regarding production lines lacks such a discussion. Therefore, this paper presents a modeling approach for the 2M-1B RP line based on the partitioning of the state space so as to reduce its complexity. Two partitions of the state space can be identified: first, each of them is mathematically treated as if it were the only one representing the system behavior (i.e. in isolation); then, the solution to the original system is determined as a combination of the solutions found for the partitions solved in isolation.
The RP line is modeled as a continuous time Markov process and the analytical solution is provided by applying the modeling approach formalized in this paper.
The remainder of the paper is organized as follows. Section 2 describes the 2M-1B RP line and introduces the main steps of the modeling approach. Section 3 develops the continuous-time mixed state RP model. Section 4 provides key performance measures and proves the conservation of flow, while Sect. 5 reports the solution technique. Finally, Sect. 6 shows interesting results from the application of the proposed model and Sect. 7 points out some concluding remarks.

The 2M-1B RP line
The two-machine one-buffer line with restart policy (2M-1B RP line) is assumed to be made up of two Markovian machines decoupled by a finite intermediate buffer. As described in Sect. 1, the behavior of the system of interest is made complex by the application of a control policy, called restart policy (RP), whose aim is to reduce the number of stoppages of the first machine due to blocking events.
A blocking event occurs when the first machine is operational but it is prevented from processing parts because the downstream buffer is full. Specifically, the restart policy acts as follows: when the buffer fills up and the first machine gets blocked, the first machine is forced to remain idle until the buffer becomes empty again. When the first machine is forced to remain idle it is said to be in the so-called "controlled idle state". In other words, the first machine is allowed to resume production only when the maximum storage space is available in the downstream. Thus, the probability of the next blocking event occurring in a short time is significantly reduced.
As a consequence of the restart policy, it is possible to identify two different system behaviors:
- The standard operation way: both machines can interact, work, fail and be repaired according to their own failure/repair rates and to the buffer level;
- The buffer drainage way: the second machine acts as if it were isolated while the first machine remains idle, i.e. in the "controlled idle state". As a consequence, the buffer level can only decrease (if the second machine is up) or stay constant (if it is down).
The 2M-1B RP line normally operates according to the standard operation behavior. The switch to the buffer drainage behavior occurs when the buffer fills up; the switch from the buffer drainage behavior back to the standard operation behavior occurs when the buffer becomes empty.
As discussed in Sect. 1, the restart policy aims to reduce the blocking frequency, i.e. the frequency at which the buffer becomes full. Thus, it is particularly useful when some waste parts or outage costs are produced each time the first machine resumes operation after a stoppage.

Modeling approach
This section introduces the main steps and definitions of the modeling approach applied to the continuous-time mixed state 2M-1B RP model in Sect. 3.
The probabilistic model of the system is studied in steady state. It consists of a state space $S$ and a transition probability matrix $P$ whose entry $p(S_i, S_j)$ is the probability of a transition from state $S_i \in S$ to state $S_j \in S$.
The procedure for obtaining the steady state probability distribution of the 2M-1B RP line is based on the partitioning of the state space S in order to split the original model into two sub-models. The steps are as follows.
Step 1 (Partitioning the state space) It is assumed that the state space $S$ can be partitioned into two non-empty subsets $P_1, P_2 \subset S$ so that
$$P_1 \cup P_2 = S, \qquad P_1 \cap P_2 = \emptyset.$$
As described in more detail in Sect. 3, it is convenient to relate the two partitions $P_1$ and $P_2$ to the two possible ways the system behaves, i.e. the standard operation behavior ($P_1$) and the buffer drainage behavior ($P_2$).
The transitions linking states of one partition to states of the other partition are called here switching transitions. Hence, there is an occurrence probability of the switching transition from $P_1$ to $P_2$, denoted as $P^s_{12}$, and an occurrence probability of the switching transition from $P_2$ to $P_1$, denoted as $P^s_{21}$. Specifically, in partition $P_i$, with $i = 1, 2$, it is possible to identify a set of internal states. The set of internal states $I_i$ of partition $P_i$ is defined as the collection of states for which every feasible transition leads to another state of the same partition, or, more precisely:
$$I_i = \{ S_k \in P_i : p(S_k, S_l) > 0 \Rightarrow S_l \in P_i \}.$$
Transitions between the two partitions, i.e. switching transitions, involve the definition of exit states, i.e. the set of states for which there exists an outbound transition to states outside the partition, and arrival states, i.e. the set of states for which there exists an inbound transition from states outside the partition.
In a formal manner, we define the set of exit states of partition $P_i$ towards partition $P_j$, with $i, j = 1, 2$ and $i \neq j$, as
$$E_{ij} = \{ S_k \in P_i : \exists\, S_l \in P_j \text{ with } p(S_k, S_l) > 0 \}.$$
The set of arrival states of partition $P_i$ from partition $P_j$, with $i, j = 1, 2$ and $i \neq j$, is defined as
$$A_{ij} = \{ S_k \in P_i : \exists\, S_l \in P_j \text{ with } p(S_l, S_k) > 0 \}. \tag{5}$$
Figure 1 shows the partitions $P_1$ and $P_2$ and the switching transitions from the exit states of one partition to the arrival states of the other partition.
Fig. 1 Step 1 and Step 2 of the modeling approach
Step 2 (Partitions in isolation) The aim is to treat each partition independently, i.e. in "isolation". In this way the mathematical treatment of the problem is greatly simplified. The original Markov chain model, described by the whole state space $S$, is split into two Markov chain sub-models, each of them described by a single partition $P_i \subset S$, $i = 1, 2$.
The main assumption for the application of this step is that the system is studied in steady state. Under this condition the following proposition holds.
Proposition 1 For the system to be in steady state, the occurrence probability $P^s_{ij}$ of the switching transition from partition $P_i$ to partition $P_j$ equals the occurrence probability $P^s_{ji}$ of the switching transition from partition $P_j$ to partition $P_i$, for $i, j = 1, 2$ and $i \neq j$.
Thus, in steady state we have
$$P^s_{ij} = P^s_{ji}. \tag{6}$$
Let us consider partition $P_i$ with $i = 1, 2$. Given Eq. (6), it is possible to replace the switching transition to $P_j$, with $j \neq i$, by a transition from its own exit states $E_{ij}$ to its own arrival states $A_{ij}$, given that the new occurrence probability equals the probability $P^s_{ij}$ of the original switching transition. Thus, as shown in Fig. 1, partition $P_i$ with $i = 1, 2$ turns out to be "isolated" (or "in isolation") in the sense that each state of the partition is connected only to other states belonging to the same partition.
The two partitions in isolation represent two renewal processes so that the corresponding Markov chain models can be solved independently. The solution to the Markov chain sub-model described by partition $P_i$ in isolation is the steady state probability distribution denoted here as $f_i$, with $i = 1, 2$.
Step 3 (Partition probabilities) Considering the original system behavior, at each renewal of partition $P_i$ in isolation, partition $P_j$ takes place, with $i, j = 1, 2$ and $i \neq j$. Thus, in order to rebuild the original Markov model describing the 2M-1B RP line, it is necessary to ensure that the system is in exactly one of the two partitions at a time. As a consequence, it is necessary to introduce the term $\pi_i$, called here the "partition probability" of $P_i$, representing the probability of the system being in partition $P_i$ with $i = 1, 2$. The partition probabilities satisfy the following normalization equation:
$$\pi_1 + \pi_2 = 1.$$
Step 4 (System solution) Once both the isolated solutions and the partition probabilities have been found, it is possible to express the system solution $f$, i.e. the solution to the original Markov chain model, as a combination of the solutions $f_i$, with $i = 1, 2$, found for each partition in isolation:
$$f(S_k) = \pi_i\, f_i(S_k) \quad \text{for } S_k \in P_i,\; i = 1, 2.$$
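The four steps can be illustrated on a small discrete-time chain. The following is only a sketch under stated assumptions: the four-state transition matrix, the helper names `stationary` and `isolate`, and the choice of a single arrival state per partition are hypothetical and are not taken from the RP model itself. The sketch isolates each partition by redirecting its switching transition to its own arrival state, solves the two sub-chains, balances the switching probabilities to obtain the partition probabilities, and compares the combined solution with the one computed directly on the whole chain.

```python
def stationary(P, iters=5000):
    """Stationary distribution of a row-stochastic matrix by power iteration."""
    n = len(P)
    v = [1.0 / n] * n
    for _ in range(iters):
        v = [sum(v[i] * P[i][j] for i in range(n)) for j in range(n)]
    return v


def isolate(P, part, arrival):
    """Sub-chain on `part`: the probability of leaving the partition
    (the switching transition) is redirected to the partition's own arrival state."""
    Q = []
    for s in part:
        row = [P[s][t] for t in part]
        row[part.index(arrival)] += 1.0 - sum(row)
        Q.append(row)
    return Q


# Hypothetical toy chain: states 0, 1 form partition P1; states 2, 3 form P2.
# State 1 is the exit state of P1 (towards arrival state 2 of P2),
# state 3 is the exit state of P2 (towards arrival state 0 of P1).
P = [
    [0.5, 0.5, 0.0, 0.0],
    [0.3, 0.4, 0.3, 0.0],
    [0.0, 0.0, 0.6, 0.4],
    [0.5, 0.0, 0.2, 0.3],
]
P1, P2 = [0, 1], [2, 3]

# Step 2: solve each partition in isolation.
f1 = stationary(isolate(P, P1, arrival=0))
f2 = stationary(isolate(P, P2, arrival=2))

# Step 3: switching probabilities conditional on being in each partition,
# then partition probabilities from pi1 * s12 = pi2 * s21 and pi1 + pi2 = 1.
s12 = sum(f1[i] * P[s][t] for i, s in enumerate(P1) for t in P2)
s21 = sum(f2[i] * P[s][t] for i, s in enumerate(P2) for t in P1)
pi1 = s21 / (s12 + s21)
pi2 = 1.0 - pi1

# Step 4: combine the isolated solutions.
f = [pi1 * v for v in f1] + [pi2 * v for v in f2]

# Reference: stationary distribution of the whole chain.
f_direct = stationary(P)
```

For this toy chain the combined vector `f` and the direct solution `f_direct` agree, which is what the balance of switching probabilities in steady state leads one to expect.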

The continuous-time mixed-state RP model
In this section the modeling approach described in Sect. 2.1 is applied to the continuous-time mixed-state model of the 2M-1B RP line.
Recall that if no control policy is in place, the continuous 2M-1B RP line reduces to a simpler 2M-1B line previously investigated in the literature, whose model, as introduced in Sect. 1, is called here the Basic model (see Gershwin and Schick 1980; Gershwin 2002).
The RP model significantly extends the Basic model. As explained in Sect. 2, the restart policy introduces two different behaviors in the system, i.e. the standard operation, where no control policy is applied, and the buffer drainage, where the first machine is put in the controlled idle state allowing the buffer level to decrease.

Notation and assumptions
One of the main advantages of modeling the system by a continuous-time mixed-state model is that inhomogeneous asynchronous machines can be properly addressed.
The main assumptions of the 2M-1B RP model are as follows:
1. A machine is said to be "starved" if its incoming flow is interrupted (the buffer is empty and the upstream machine is down). A machine is said to be "blocked" if its outgoing flow is interrupted (the buffer is full and the downstream machine is down). For the 2M-1B RP line it is assumed that the first machine is never starved and the second machine is never blocked;
2. The material that is processed is treated as though it is a continuous fluid;
3. The machines are either up (i.e. operational) or down (i.e. under repair) so that there is no middle ground in the model;
4. The machines may have different deterministic production rates so that the line can be inhomogeneous;
5. The machines have exponentially distributed times between failures and times to repair;
6. The failures are assumed to be operation-dependent, so that the first machine cannot fail while it is blocked or in the controlled idle state, and the second one cannot fail while it is starved;
7. The intermediate buffer has a finite capacity (note that, as a consequence of assumption 2, the buffer level is continuous in the range defined by zero and the maximum capacity);
8. The material in process is not destroyed or rejected at any stage in the system;
9. When the restart policy takes place the system switches from the standard operation behavior to the buffer drainage behavior. This happens when the buffer fills up. The system returns to the standard operation behavior when the buffer becomes empty again.
The following notation is adopted:
- $\mu_i$ is the production rate of machine $i$, with $i = 1, 2$ (see assumption 4);
- $p_i$ is the failure rate of machine $i$, with $i = 1, 2$ (see assumption 5);
- $r_i$ is the repair rate of machine $i$, with $i = 1, 2$ (see assumption 5).
As regards failure probabilities and slowdown phenomena, there are some important differences between the Basic model and the RP model. It can be useful to recall that in the Basic model:
- if $\mu_2 < \mu_1$ and the buffer is full, the first machine is forced to slow down its speed to $\mu_2$. Thus, if $x = N$ the probability of failure of the first machine at time $t + \delta t$, provided that $\alpha_1(t) = 1$, is $p_1^b \delta t$, where
$$p_1^b = p_1 \frac{\mu_2}{\mu_1};$$
- if $\mu_1 < \mu_2$ and the buffer is empty, the second machine is forced to slow down its speed to $\mu_1$. Thus, if $x = 0$ the probability of failure of the second machine at time $t + \delta t$, provided that $\alpha_2(t) = 1$, is $p_2^b \delta t$, where
$$p_2^b = p_2 \frac{\mu_1}{\mu_2}. \tag{10}$$
This is because failures are assumed to be operation dependent (see assumption 6). When the buffer is not empty, the conditional failure probability of the second machine is $p_2 \delta t$. When the buffer is not full, the conditional failure probability of the first machine is $p_1 \delta t$.
In the RP model the assumption of operation dependent failures is retained. Nevertheless, we assume that as soon as the buffer fills up and the second machine is operational ($\alpha_2 = 1$) the restart policy takes place. As a consequence, in the RP model the first machine is not allowed to slow down. If $\mu_2 > \mu_1$ the only non-transient state with the buffer level at the maximum capacity $N$ is the one with the first machine up and the second machine down. In this case the switch to the buffer drainage behavior takes place when the second machine is repaired. If $\mu_1 > \mu_2$, the buffer may fill up even if both machines are operational. Nevertheless, contrary to the Basic model, we assume that the system cannot persist in that state (i.e., it becomes transient) since the system instantaneously switches to the buffer drainage behavior and the first machine is put in the controlled idle state. Thus, in the RP model the probability of failure of the first machine in $\delta t$, given that it is operational at $t$, is always $p_1 \delta t$.
On the other hand, as occurs in the Basic model, if $\mu_1 < \mu_2$ and $x = 0$ the second machine slows down to $\mu_1$, so that its failure rate must be corrected according to Eq. (10).
The probability that a machine $i$ that is down at $t$ ($\alpha_i(t) = 0$) is repaired by time $t + \delta t$ is $r_i \delta t$ for both the Basic model and the RP model.
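The difference between the two models at the buffer boundaries can be summarized in a few lines of Python. This is a minimal sketch, not part of the original formulation: the function name `boundary_failure_rates` and its `model` argument are hypothetical, and the returned pair is to be read as the failure rate of the first machine while it is producing at $x = N$ and the failure rate of the second machine while it is producing at $x = 0$.

```python
def boundary_failure_rates(mu1, mu2, p1, p2, model="RP"):
    """Failure rates of producing machines at the buffer boundaries.

    Returns (rate of machine 1 at x = N, rate of machine 2 at x = 0).
    `model` is "Basic" (slowdown allowed at both boundaries) or "RP"
    (no slowdown of machine 1, because the restart policy takes over).
    """
    if model not in ("Basic", "RP"):
        raise ValueError("model must be 'Basic' or 'RP'")

    # Empty buffer: in both models a faster machine 2 slows down to mu1,
    # so its failure rate is scaled by mu1 / mu2.
    p2_b = p2 * mu1 / mu2 if mu1 < mu2 else p2

    # Full buffer: only the Basic model lets a faster machine 1 slow down
    # to mu2; in the RP model machine 1 never slows down.
    if model == "Basic" and mu2 < mu1:
        p1_b = p1 * mu2 / mu1
    else:
        p1_b = p1
    return p1_b, p2_b
```

For instance, with `mu1 = 2`, `mu2 = 1`, `p1 = 0.1` and `p2 = 0.2`, the Basic model halves the failure rate of the first machine at the full buffer, whereas the RP model leaves it at `p1`.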

Step 1: Partitioning the state space
It is convenient to partition the state space according to the two different ways the system behaves so that we have:
- the standard operation partition;
- the buffer drainage partition.
Thus, the system state can be defined as $S = (\beta, x, \alpha_1, \alpha_2)$ where, as described in Sect. 3.1, $x \in \mathbb{R}$ is the buffer level with $0 \le x \le N$; $\alpha_i = 0, 1$ is the condition of machine $i = 1, 2$; and $\beta$ is a binary parameter that is conveniently introduced here to distinguish between states belonging to the standard operation partition ($\beta = 0$) and states belonging to the buffer drainage partition ($\beta = 1$). It is assumed that when $\beta = 1$, i.e. the first machine is in the controlled idle state, $\alpha_1$ is fixed and set to 1 (the machine is forced to remain idle, but it is operational and cannot fail) so that states $(1, x, 0, \alpha_2)$ are not feasible. Note that the system state includes three binary parameters and a continuous component $x$. Thus, the probability distribution has a density function on $(0, N)$, denoted as $f(\beta, x, \alpha_1, \alpha_2)$. The aim of this work is to express the solution $f(\beta, x, \alpha_1, \alpha_2)$ in order to be able to compute some performance measures.
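As an illustration only, the state definition above can be encoded as a small Python record with a feasibility check. The class name `RPState` and the method `is_feasible` are hypothetical names introduced for this sketch; the check covers only the constraints stated in this section (binary parameters, buffer level within its physical limits, and no failed first machine in the buffer drainage partition).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RPState:
    """System state (beta, x, alpha1, alpha2) of the 2M-1B RP line."""
    beta: int     # 0 = standard operation partition, 1 = buffer drainage partition
    x: float      # buffer level
    alpha1: int   # condition of machine 1 (1 = up, 0 = down)
    alpha2: int   # condition of machine 2 (1 = up, 0 = down)

    def is_feasible(self, N: float) -> bool:
        """True if the state respects the constraints stated in the text."""
        if any(b not in (0, 1) for b in (self.beta, self.alpha1, self.alpha2)):
            return False
        if not 0.0 <= self.x <= N:
            return False
        # In the controlled idle state machine 1 is operational and cannot fail,
        # so states (1, x, 0, alpha2) are excluded.
        if self.beta == 1 and self.alpha1 == 0:
            return False
        return True
```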
The switching transition from the standard operation partition to the buffer drainage partition occurs if the buffer fills up.
Let us consider the simplest case where $\mu_1 \le \mu_2$. The buffer might fill up only if the first machine is up and the second is down. In fact, since the first machine is not faster than the second one, if both machines are operational the buffer level does not increase and the maximum buffer size $N$ cannot be reached. In other words, the only non-transient state with the buffer level $x = N$ is that with $\alpha_1 = 1$ and $\alpha_2 = 0$. For convenience, we assume that this state belongs to the standard operation partition so that it is denoted as $(0, N, 1, 0)$, where the first term $\beta$ is set to zero.
Note that state $(0, N, 1, 0)$ is peculiar for the following reasons:
- in this state the first machine is blocked, i.e. it is operational but it cannot work since its outgoing flow is prevented, the second machine being down and no storage space being left in the buffer;
- since the maximum buffer size $N$ is defined by a physical limit, this state acts as a mass point which itself has nonzero probability (see Gershwin 2002).
Let $p(0, N, 1, 0)$ be the probability of the system being in state $(0, N, 1, 0)$.
The behavior of the 2M-1B RP line under analysis is such that when the second machine is repaired (and, consequently, the buffer level starts to decrease) the first machine is not allowed to resume production but is forced to remain idle, i.e. it is put in the "controlled idle state". Thus, as soon as the second machine is repaired the switching transition from the standard operation partition to the buffer drainage partition occurs. This means that state $(0, N, 1, 0)$ is the exit state of the standard operation partition. The arrival states of the buffer drainage partition are states where the buffer level is reduced by at most the amount processed by the second machine in $\delta t$ and both machines are up, since the first machine cannot fail (it is prevented from working) and the second machine has been repaired. Thus, the arrival states are $(1, x, 1, 1)$ with $N - \mu_2 \delta t \le x < N$.
Table 1 Exit and arrival states of the standard operation and buffer drainage partitions
Since the switching transition occurs if the system is in state $(0, N, 1, 0)$ and the second machine is repaired, we have:
$$P^s_{12} = r_2\, \delta t\; p(0, N, 1, 0).$$
Once the system has entered the buffer drainage partition, it remains in this partition until the buffer becomes empty. Note that the first machine is in the "controlled idle state" so that it can neither work nor fail, and the buffer level can only decrease (if the second machine is up) or stay constant (if the second machine is down).
Hence, the switching transition from the buffer drainage partition to the standard operation partition occurs when $x$ reaches the physical limit at zero while the second machine is up ($\alpha_2 = 1$). Specifically, we assume that the switching transition occurs between state $(1, x, 1, 1)$ with $0 < x \le \mu_2 \delta t$ of the buffer drainage partition and state $(0, 0, 1, 1)$ of the standard operation partition, where state $(0, 0, 1, 1)$ is a mass point, similarly to $(0, N, 1, 0)$.
Since the system is studied in steady state, Proposition 1 holds so that we have:
$$P^s_{12} = P^s_{21}.$$
The above considerations refer to the case $\mu_1 \le \mu_2$; the exit and arrival states for the case $\mu_1 > \mu_2$ are reported in Table 1.
Note that when the system is in internal states of the standard operation partition (i.e. with intermediate buffer levels) it can be seen as a simple 2M-1B line where machines work, fail and are repaired according to their own reliability parameters. This is the case covered by a previous model presented in the literature, referred to here as the Basic model (see Gershwin and Schick 1980; Gershwin 2002). Thus, we recall the Basic model as regards the states of the standard operation partition with intermediate buffer level, but we extend it significantly by considering the switching transitions to/from the buffer drainage partition introduced by the restart policy.
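The switching logic described in this section can also be explored with a crude sample-path simulation. The sketch below is not the analytical solution developed in this paper: it is a time-stepped approximation with a fixed step `dt`, and the function name `simulate_rp_line`, its arguments and the flags `blocked` and `drainage` are hypothetical. It assumes operation-dependent failures, with the failure rate of the second machine scaled by its actual outflow (which gives the correction of Eq. (10) at the empty buffer), and it returns an estimate of the blocking frequency, i.e. the number of times the buffer becomes full per unit time.

```python
import random


def simulate_rp_line(mu1, mu2, p1, p2, r1, r2, N, horizon, dt=0.01, seed=0):
    """Time-stepped sketch of the 2M-1B RP line; returns blocking events per unit time."""
    rng = random.Random(seed)
    x = 0.0            # buffer level
    a1, a2 = 1, 1      # machine conditions (1 = up, 0 = down)
    blocked = False    # machine 1 blocked at x = N while machine 2 is down
    drainage = False   # machine 1 in the controlled idle state
    blocking_events = 0

    for _ in range(int(round(horizon / dt))):
        working1 = a1 == 1 and not blocked and not drainage
        inflow = mu1 if working1 else 0.0
        if a2 == 0:
            outflow = 0.0
        elif x > 0.0:
            outflow = mu2
        else:
            outflow = min(mu2, inflow)  # empty buffer: starved or slowed to mu1

        x += (inflow - outflow) * dt
        if x >= N:
            x = N
            if not blocked and not drainage:
                blocking_events += 1          # the buffer has just become full
                if a2 == 1:
                    drainage = True           # restart policy applies at once
                else:
                    blocked = True            # wait for the repair of machine 2
        elif x <= 0.0:
            x = 0.0
            drainage = False                  # back to standard operation

        # Operation-dependent failures and repairs for the next step.
        working1 = a1 == 1 and not blocked and not drainage
        if working1 and rng.random() < p1 * dt:
            a1 = 0
        elif a1 == 0 and rng.random() < r1 * dt:
            a1 = 1
        if outflow > 0.0 and rng.random() < p2 * (outflow / mu2) * dt:
            a2 = 0
        elif a2 == 0 and rng.random() < r2 * dt:
            a2 = 1
            if blocked:
                blocked, drainage = False, True   # switch to buffer drainage

    return blocking_events / horizon
```

As a sanity check, with no failures (`p1 = p2 = 0`), `mu1 = 2`, `mu2 = 1` and `N = 10`, the buffer fills in `N / (mu1 - mu2) = 10` time units and drains in `N / mu2 = 10` time units, so over a horizon of 100 the estimate should be one blocking event every 20 time units.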

Step 2: Partitions in isolation
In this section the two partitions are treated separately, i.e. in isolation.

By applying Step 2 of the modeling approach, we isolate each partition by using a direct transition from its own exit states to its own arrival states. Once a partition has been "isolated" it can be solved by taking it as "standing alone".
For convenience, we use a simplified notation for the partitions in isolation. Specifically, the system state for the standard operation partition in isolation is defined as S S = (x, α 1 , α 2 ), and for the buffer drainage partition in isolation as S D = (x, α 2 ), where x is the buffer level and α i = 0, 1 is the condition of machine i = 1, 2. S S = (x, α 1 , α 2 ) corresponds to state (0, x, α 1 , α 2 ) of the original Markov process, while S D = (x, α 2 ) corresponds to state (1, x, 1, α 2 ) of the original Markov process. Note that states in the buffer drainage partition in isolation do not depend on α 1 since the first machine is in the controlled idle state (it can neither work nor fail) and the second machine operates as if it were isolated.
Let f S (x, α 1 , α 2 , t) and p S (x, α 1 , α 2 , t) be the probability density function and the probability of being in state (x, α 1 , α 2 ) of the standard operation partition at time t, and let f D (x, α 2 , t) be the probability density function for the buffer drainage partition.

Standard operation partition in isolation
This partition can be solved in isolation by considering it as the only one representing the system behavior. In words, transitions to/from the buffer drainage partition (characterizing the original complex behavior) are replaced with direct transitions from the exit to the arrival states of the standard operation partition itself. Thus, the standard operation partition in isolation can be thought of as modeling a system where each time the buffer fills up, it empties instantaneously as the second machine gets repaired (or does not fail, in case μ 1 > μ 2 ).
Specifically, for μ 1 ≤ μ 2 , if the system is in state (N, 1, 0) it can enter only state (0, 1, 1), when the second machine is repaired; for μ 1 > μ 2 , if the system is in state (N, 1, 0), or if both machines are operational and the buffer level is approaching the maximum capacity (i.e., the system is in states (x, 1, 1) with N − (μ 1 − μ 2 )δt ≤ x < N), it will pass to state (x, 1, 1) with 0 < x ≤ (μ 1 − μ 2 )δt if the second machine is repaired or no failures occur, respectively.
As introduced above, the standard operation partition in isolation can be modeled by means of the Basic model without restart policy (see Gershwin and Schick 1980;Gershwin 2002) except for the boundary equations that represent how the system leaves/enters the exit/arrival states.
For the sake of clarity, the most significant equations modeling this partition are discussed in the sequel by treating the three cases μ 1 < μ 2 , μ 1 = μ 2 and μ 1 > μ 2 separately.
All equations for this partition are listed in Appendix A. Finally, the following normalization equation, expressing that the sum of all probabilities must equal 1, is needed to solve the partition in isolation:

$$\sum_{\alpha_1=0}^{1}\sum_{\alpha_2=0}^{1}\left[\int_0^N f_S(x,\alpha_1,\alpha_2)\,dx + p_S(0,\alpha_1,\alpha_2) + p_S(N,\alpha_1,\alpha_2)\right] = 1. \quad (15)$$

The technique used to determine the solution to the standard operation partition in isolation is explained in detail in Sect. 5.

Case μ 1 ≤ μ 2
As regards the lower boundary (x = 0), it is necessary to describe how the system arrives at the arrival state (0, 1, 1).
Thus, states (x, 0, 1), with N − μ 2 δt ≤ x < N, cannot be reached from the boundary (because of the restart policy) or from any intermediate-buffer-level state (they cannot be reached from states (x′, α 1 , α 2 ) in δt, if x′ ≤ x and δt is small, because when the second machine is working the buffer level decreases).
Symbolically, if the second order terms are ignored, we obtain f S (N, 0, 1) = 0.
If μ 1 < μ 2 , we notice that states (x, 1, 1), with N − (μ 2 − μ 1 )δt ≤ x < N, cannot be reached from the boundary either, since the restart policy forces the system to enter the arrival state (0, 1, 1). Thus, f S (N, 1, 1) = 0. Similarly, if μ 1 = μ 2 , state (N, 1, 1) can be reached only from itself in δt, if no failures occur. As a consequence, p S (N, 1, 1) = 0.

Case μ 1 > μ 2
It is convenient to start by discussing the upper boundary. Recall that the standard operation partition in isolation with μ 1 > μ 2 can be seen as representative of a fictitious system where, as soon as the buffer gets full with the second machine operational, the buffer level falls to zero (note that reaching states with x = N is possible only if the first machine is operational). This means that state (N, 1, 1) is transient. Thus, the system passes directly from states (x, 1, 1) with N − (μ 1 − μ 2 )δt ≤ x < N to states (x, 1, 1) with 0 < x ≤ (μ 1 − μ 2 )δt. Note that state (0, 1, 1) is also transient since the system cannot persist in that state if μ 1 > μ 2 . Therefore, p S (N, 1, 1) = 0 and p S (0, 1, 1) = 0.
Another effect of the restart policy is that states (x, 0, 1) with N − μ 2 δt ≤ x < N cannot be reached from the boundary either. Thus, to the first order, f S (N, 0, 1) = 0.
As regards the lower boundary (x = 0), to arrive at state (x, 1, 1) with 0 < x ≤ (μ 1 − μ 2 )δt at time t + δt, the system may have been in one of three sets of states at time t . It could have been in state (0, 0, 1) with a repair of the first machine. It could have been in state (N, 1, 0) with a repair of the second machine. It could have been in any state (x, 1, 1) with N − (μ 1 − μ 2 )δt ≤ x < N if no failures occur. The latter two transitions are not feasible in the Basic model since they are related to the restart policy introduced in this work. Note that, since μ 1 > μ 2 , it is not possible to reach state (x, 1, 1) with 0 < Symbolically, ignoring the second order terms, Letting δt → 0, the equation becomes (μ 1 − μ 2 )f S (0, 1, 1) = r 1 p S (0, 0, 1) + r 2 p S (N, 1, 0) + (μ 1 − μ 2 )f S (N, 1, 1). (22)

Buffer drainage partition in isolation
In the buffer drainage partition the first machine is operational but in the controlled idle state, thus it does not process material and, as a consequence, failures cannot occur. Hence, the system state for this partition does not depend on the state of the first machine and can be represented simply as (x, α 2 ), with α 2 = 0, 1. Moreover, states belonging to the buffer drainage partition are characterized by a buffer level x with 0 < x < N and, consequently, only states with intermediate buffer levels are involved.
This leads to the following equations for the buffer drainage partition in isolation: where the t argument is suppressed and f D (x, α 2 ) represents the probability density function of state (x, α 2 ).
Since the steady state versions of Eqs. (23) and (24) must be satisfied simultaneously, we obtain

$$\frac{d f_D(x,\alpha_2)}{dx} = 0, \qquad \alpha_2 = 0, 1.$$

Therefore, f D (x, α 2 ) is constant.
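The constancy of f D can be traced through a short derivation. The equations below are a sketch of internal equations consistent with the description of this partition (only the second machine evolves, and the buffer level decreases at rate μ 2 while it is up); the failure rate p 2 of the second machine is an assumed symbol, and these forms are not guaranteed to coincide term by term with Eqs. (23) and (24).

```latex
\begin{align*}
\frac{\partial f_D(x,0,t)}{\partial t} &= p_2\, f_D(x,1,t) - r_2\, f_D(x,0,t), \\
\frac{\partial f_D(x,1,t)}{\partial t} - \mu_2 \frac{\partial f_D(x,1,t)}{\partial x}
  &= r_2\, f_D(x,0,t) - p_2\, f_D(x,1,t).
\end{align*}
% Steady state: set the time derivatives to zero.
\begin{align*}
r_2\, f_D(x,0) &= p_2\, f_D(x,1), \\
-\mu_2 \frac{d f_D(x,1)}{dx} &= r_2\, f_D(x,0) - p_2\, f_D(x,1) = 0 .
\end{align*}
% Hence f_D(x,1) does not depend on x, and neither does
% f_D(x,0) = (p_2 / r_2) f_D(x,1).
```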
The buffer drainage partition in isolation can be thought of as describing the reduction of the buffer level according to the production, failure and repair rates of the second machine only. Thus, the probability density function describing the system behavior in this case depends only on α 2 and it can be indicated simply as follows: Equations related to this partition are listed in Appendix A.
As in the previous case, the isolation procedure requires the following normalization equation in order to find the partition solution:

$$\int_0^N \big[f_D(x,0) + f_D(x,1)\big]\,dx = 1. \quad (27)$$

This makes it possible to obtain the solution to the buffer drainage partition in isolation, as detailed in Sect. 5.

Step 3: Partition probabilities
Once the solutions to both the isolated partitions have been found, it is necessary to consider that the system can be in exactly one of the two partitions at a time, i.e. we have to compute the probability of being either in the standard operation partition or in the buffer drainage partition.
Let T be the mean time between the occurrences of the same switching transition in steady state. The following equation holds:

$$T = T_S + T_D,$$

where T S is the mean time spent in states of the standard operation partition during T, and T D is the mean time spent in states of the buffer drainage partition during T. Thus, the probability of being in each partition can be expressed in terms of T, T S and T D as follows:

$$\pi_S = \frac{T_S}{T}, \qquad \pi_D = \frac{T_D}{T}.$$

Since T S can be seen as the mean time between two switching transitions from the standard operation partition to the buffer drainage partition, we have:

$$T_S = \frac{1}{\varphi_{S,D}},$$

where φ S,D is the frequency of the switch from the standard operation partition to the buffer drainage partition, given that the system is in states of the standard operation partition. In other words, given that the system is in the standard operation partition, φ S,D is the probability of entering (and not persisting in) state (N, 1, 0) or, if μ 1 > μ 2 , the probability of being in states (x, 1, 1), with N − (μ 1 − μ 2 )δt ≤ x < N, with no failures occurring. Thus,

$$\varphi_{S,D} = r_2\, p_S(N,1,0) + (\mu_1-\mu_2)\, f_S(N,1,1),$$

where the second term is present only if μ 1 > μ 2 . By similar reasoning, T D is defined as the mean time between two switching transitions from the buffer drainage partition to the standard operation partition. Thus,

$$T_D = \frac{1}{\varphi_{D,S}},$$

where φ D,S is the frequency of the switch from the buffer drainage partition to the standard operation partition, given that the system is in states of the buffer drainage partition. Recall that in the buffer drainage partition we consider internal states only and the probability of entering any of them is constant, independent of the buffer level x. So,

$$\varphi_{D,S} = \mu_2\, f_D(x,1).$$

Finally, the expressions of π S and π D become:

$$\pi_S = \frac{\varphi_{D,S}}{\varphi_{S,D}+\varphi_{D,S}}, \qquad \pi_D = \frac{\varphi_{S,D}}{\varphi_{S,D}+\varphi_{D,S}}.$$
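The step from switching frequencies to partition probabilities can be written out in a few lines. This is a minimal sketch assuming that the two conditional switching frequencies have already been computed from the isolated solutions; the function and argument names are hypothetical.

```python
def partition_probabilities(phi_sd, phi_ds):
    """Return (pi_S, pi_D) from the two conditional switching frequencies.

    phi_sd: frequency of the switch standard operation -> buffer drainage,
            given that the system is in the standard operation partition.
    phi_ds: frequency of the switch buffer drainage -> standard operation,
            given that the system is in the buffer drainage partition.
    """
    if phi_sd <= 0.0 or phi_ds <= 0.0:
        raise ValueError("switching frequencies must be positive")
    t_s = 1.0 / phi_sd      # mean time spent in standard operation per cycle
    t_d = 1.0 / phi_ds      # mean time spent in buffer drainage per cycle
    t = t_s + t_d           # mean cycle time T
    return t_s / t, t_d / t
```

The less frequent the switch out of a partition, the longer the system stays there and the larger its partition probability, so a line that rarely fills its buffer has π S close to 1.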

Step 4: System solution
In the case of interest, the system behavior is represented by either the standard operation partition or the buffer drainage partition according to the buffer level. Specifically, the standard operation partition describes the system until the buffer gets full and a blocking event occurs. The buffer drainage partition represents the system behavior after a blocking event has occurred, until the buffer becomes empty again. As explained in Step 4 of the modeling approach described in Sect. 2.1, the solution to the original system is a combination of the solutions found for the two partitions in isolation. Given that the system state is defined as in Sect. 3.2, we have:

$$f(0,x,\alpha_1,\alpha_2) = \pi_S\, f_S(x,\alpha_1,\alpha_2), \qquad f(1,x,1,\alpha_2) = \pi_D\, f_D(x,\alpha_2),$$

where π S is the partition probability of being in the standard operation partition and π D = 1 − π S is the partition probability of being in the buffer drainage partition as defined in Sect. 3.4. Recall that the first machine cannot fail during the buffer drainage partition since it is in the controlled idle state. Thus,

$$f(1,x,0,\alpha_2) = 0.$$

In order to complete the solution, boundary probabilities must be considered. Since states with x = 0 or x = N belong to the standard operation partition only, we have:

$$p(0,0,\alpha_1,\alpha_2) = \pi_S\, p_S(0,\alpha_1,\alpha_2)$$

and

$$p(0,N,\alpha_1,\alpha_2) = \pi_S\, p_S(N,\alpha_1,\alpha_2).$$

The solution to the 2M-1B RP model described in this paper can be obtained in closed form as reported in Sect. 5.

Blocking frequency, production rate and conservation of flow
Since the aim of the restart policy is to reduce the stoppages of the first machine due to blocking events, the blocking frequency f b is a fundamental performance measure.
Recall that the first machine gets blocked when the system reaches state (0, N, 1, 0), i.e. when the buffer is full, the second machine is down and the first machine, even if still operational, cannot release material. Therefore, the blocking frequency f b can be determined as the probability of entering (or, equally, of exiting) that state. So, f b = r 2 p(0, N, 1, 0), where, for the sake of simplicity, the blocking frequency is expressed as the probability of exiting state (0, N, 1, 0).
Another important performance measure is the line production rate. The production rate of each machine i (with i = 1, 2), i.e., the rate at which material leaves the machine, is equal to its nominal capacity multiplied by its efficiency E i . Specifically, the speed at which machine i can operate is μ i if machine i is not limited by the other one (e.g., if μ 1 < μ 2 , when the buffer is empty and the first machine is operational, the second machine cannot be faster than the first one).
Consequently, considering non-zero probabilities only, we have

$$P_1 = \mu_1\left[\int_0^N \big(f(0,x,1,0)+f(0,x,1,1)\big)\,dx + p(0,0,1,1)\right]$$

for the first machine, and

$$P_2 = \mu_2\int_0^N \big(f(0,x,0,1)+f(0,x,1,1)+f(1,x,1,1)\big)\,dx + \mu_1\,p(0,0,1,1)$$

for the second one. Note that if μ 1 > μ 2 the term p(0, 0, 1, 1) is equal to zero. Since the first machine is in the controlled idle state during the buffer drainage partition, states belonging to this partition influence the production rate of the second machine only.
For the system to be in the steady state, the following equation (conservation of flow equation) must hold: P 1 = P 2 . The proof is reported in Appendix B.

Solution technique
In this section the solution technique adopted to solve the two partitions in isolation is explained in detail. Given the isolated solutions f S (x, α 1 , α 2 ) and f D (x, α 2 ), the partition probabilities can be easily determined according to Eqs. (35) and (36) and, finally, the system solution according to Eq. (37).

Standard operation partition in isolation
It is natural to assume the following exponential form for the solution to the steady state density equations of the standard operation partition in isolation:

$$f_S(x,\alpha_1,\alpha_2) = C\, e^{\lambda x}\, Y_1^{\alpha_1} Y_2^{\alpha_2}. \quad (45)$$

By substituting (45) in the internal equations belonging to this partition (Eqs. (93)-(96) in Appendix A) we obtain the following three parametric equations:
If μ 1 ≠ μ 2 , Eqs. (46)-(48) can be reduced to a single quadratic equation in Y 1 :
Eq. (49) has the following two solutions:
By substituting (50) and (51) in (46) and (47), it is possible to find the expressions for the remaining parameters Y 21 , Y 22 , λ 1 , λ 2 .
Moreover, another feasible solution is the following: Consequently, the solution for internal states can be expressed as where, from Eq. (54), the third component of the solution is constant. If μ 1 = μ 2 = μ, Eq. (49) reduces to a linear equation whose solution is From the parametric equations (47) and (48) we obtain: It is convenient to treat separately the three cases related to μ 1 = μ 2 , μ 1 < μ 2 and μ 1 > μ 2 .
Case μ 1 < μ 2
The boundary conditions yield
It is possible to express the constants C S 2 and C S 3 in terms of C S 1 by Eqs. (17) and (18). Therefore, the only unknown parameter in the standard operation partition model is C S 1 . The value of C S 1 can be found by means of the normalization equation for the standard operation partition in isolation, i.e. Eq. (15). Note that the Basic model without restart policy presents only two constants, while C S 3 ≠ 0 in the RP model as a consequence of the restart policy.

Case μ 1 > μ 2
The boundary probabilities are the following:
The constants C S 2 and C S 3 can be expressed in terms of C S 1 by Eq. (102), reported in Appendix A, evaluated at the steady state, and by Eq. (21).
In this case too, the only unknown parameter is C S 1 , whose value can be obtained from the normalization equation for the standard operation partition in isolation (Eq. (15)).

Case μ 1 = μ 2
Consequently, the solution for internal states can be simply expressed as where, for the sake of clarity, the same notation used for the previous cases has been maintained.
The boundary conditions yield Recall that in the Basic model without restart policy the constant C S 3 in Eq. (79) is zero. On the contrary, here the relation between C S 1 and C S 3 can be found by considering Eq. (17) leading to the following: The normalization equation (15) allows us to find the value of C S 1 and complete the solution.

Buffer drainage partition in isolation
As regards the buffer drainage partition in isolation, a solution satisfying Eqs. (23) and (24) is the following: The value of C D can be obtained by the normalization equation for the buffer drainage partition in isolation (Eq. (27)) as follows: so,
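The constant solution of the buffer drainage partition can be computed directly. The sketch below assumes that the steady state balance between the two machine conditions reads r 2 f D (x, 0) = p 2 f D (x, 1), where the failure rate `p2` of the second machine is an assumed symbol, and that the normalization is the integral of both densities over 0 < x < N. The function name is hypothetical, and the constant it calls `c_d` is taken to be f D (x, 1), which may differ from the definition of C D adopted in the paper.

```python
def drainage_solution(p2, r2, N):
    """Constant densities of the buffer drainage partition in isolation.

    Assumes r2 * f_D(x, 0) = p2 * f_D(x, 1) and
    integral_0^N [f_D(x, 0) + f_D(x, 1)] dx = 1.
    Returns a dict mapping alpha2 (0 or 1) to the constant density.
    """
    if r2 <= 0.0 or p2 < 0.0 or N <= 0.0:
        raise ValueError("require r2 > 0, p2 >= 0 and N > 0")
    c_d = r2 / (N * (r2 + p2))          # density with the second machine up
    return {1: c_d, 0: c_d * p2 / r2}   # density with the second machine down
```

With these assumptions the fraction of drainage time in which the second machine is up equals r 2 /(r 2 + p 2 ), i.e. its isolated efficiency, which agrees with the remark that the second machine operates as if it were isolated.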

System solution
Once the solutions have been determined for both the partitions in isolation, we can derive the probabilities of the system being in each partition from Eqs. (35) and (36). Finally, the solution to the original system can be expressed as follows: where the constant parameters C j , Y ij , λ j with i = 1, 2 and j = 1, 2, 3 are computed according to Sect. 5.

Numerical results
In the following some interesting numerical examples are proposed where the RP model described in this paper is compared with the Basic model without restart policy (see Gershwin and Schick 1980;Gershwin 2002). In this way it is possible to discuss the benefits that may derive from the adoption of the restart policy.
In particular, scenarios with μ 1 e 1 > μ 2 e 2 and with μ 1 e 1 < μ 2 e 2 are investigated. Configurations in which μ 1 < μ 2 , μ 1 = μ 2 , and μ 1 > μ 2 are evaluated as well. Table 2 reports input data for Example 1a, where the isolated productivity of the first machine μ 1 e 1 is greater than that of the second machine μ 2 e 2 , while μ 1 ≤ μ 2 . The most important analytical results (i.e., line production rate P, blocking frequency f b , starvation probability ps and blocking probability pb) for each scenario of Example 1a are reported in Table 3 by varying the buffer size N. Figure 2 depicts the blocking frequency for each scenario of Example 1a. Light lines in Fig. 2 represent the RP model and bold lines represent the Basic model. Same line type means same input data. As can be easily seen, when the restart policy is not adopted, the blocking frequency approaches a limit greater than zero as the buffer capacity increases. Thus, when μ 1 e 1 > μ 2 e 2 ,
- if the restart policy is not adopted, there is a nonzero probability of the buffer filling up even if large buffer capacities are involved;
- the introduction of the restart policy makes it possible to significantly reduce the blocking frequency, allowing it to tend to zero when the buffer capacity is large enough.
Therefore, if μ 1 e 1 > μ 2 e 2 and the outage costs on the first machine are critical (so that the blocking frequency is a key performance measure that should be kept as low as possible), the adoption of the restart policy is convenient. This situation occurs, e.g., in automated packaging lines of the food and beverage sector.
The same result is obtained in Fig. 3 showing the blocking frequency for each scenario of Example 1b (refer to Table 4 for input data). While μ 1 e 1 is still greater than μ 2 e 2 , now μ 1 ≥ μ 2 . The most important analytical results (i.e., line production rate P , blocking frequency f b , starvation probability ps and blocking probability pb) are reported in Table 5.
Finally, Table 6 reports input data for Example 2, representing scenarios with μ 1 e 1 < μ 2 e 2 . The analytical results in this case are reported in Table 7. Figure 4 shows the blocking frequency for both scenarios. As in previous figures, light lines represent the RP model and bold lines the Basic model, and same line type means same input data. In such situations, the blocking frequency naturally approaches the limit zero as the buffer capacity increases. This is true for both models, with or without restart policy. What can be noted is that the restart policy affects the speed of convergence. This is especially evident for one of the two scenarios (see Table 6 and the dashed line in Fig. 4), where the restart policy makes it possible to reduce the blocking frequency even when small buffers are adopted.
Thus, if μ 1 e 1 < μ 2 e 2 it is possible to reduce the blocking frequency by either using a large buffer or adopting the restart policy.
Moreover, it is important to consider that even if the restart policy has a beneficial effect on the blocking frequency, the probability of starvation is expected to increase. Figure 5 shows the starvation probability for Example 2. It can be seen that, especially if the buffer is small, the adoption of the restart policy implies a higher starvation probability. Thus, it may be convenient only if outage costs on the first machine are significant.
In general, the decision whether or not to adopt the restart policy would be based on a carefully considered evaluation of other constraints that are out of the scope of this work (e.g., buffer capacity constraints and costs). This is especially true if μ 1 e 1 < μ 2 e 2 .

Conclusions
The work addresses the performance estimation of a 2M-1B production line in which a control policy is adopted to control the machines' behavior according to a specific event happening in the line, i.e. the buffer filling up. Since the introduction of such a control policy increases the complexity of the problem, a modeling approach based on the partitioning of the state space has been developed so as to facilitate mathematical tractability.
The production line under study consists of two machines decoupled by a finite buffer where a restart control policy (RP) is introduced on the first machine. The aim is to prevent this machine from producing parts each time the buffer gets full, until it becomes empty again. This policy is frequently adopted in industrial installations where outage costs (e.g., production of a certain amount of waste) are generated during the restart phase of the machines.
The RP model is developed as a continuous time Markov process so as to allow the consideration of machines having different capacities. The exact analytical solution of the model is provided and the conservation of flow is proved. Moreover, the expressions of the most important performance measures are derived.
Numerical examples prove the ability of the RP model to represent the effects of the adopted restart policy on the blocking frequency (and, as a consequence, on the line efficiency), as a function of the buffer capacity and the machines' parameters. The resulting model represents an important tool able to point out the convenience of adopting a restart policy in a production line and to measure its effects as a function of the line characteristics.

Appendix B: Conservation of flow: proof
For the system to be in the steady state, the following equation (conservation of flow equation) must hold: P 1 = P 2 .
Proof. For the sake of clarity, the expression of the total system solution can be split into the components related to the two partitions as follows:

$$P_2 = \pi_S\left[\mu_2\int_0^N \big(f_S(x,0,1)+f_S(x,1,1)\big)\,dx + \mu_1\,p_S(0,1,1)\right] + \pi_D\,\mu_2\int_0^N f_D(x,1)\,dx.$$

By adding the steady state versions of the internal differential equations (93)-(96) in Appendix A, we obtain

$$\frac{d}{dx}\Big[(\mu_2-\mu_1)\, f_S(x,1,1) + \mu_2\, f_S(x,0,1) - \mu_1\, f_S(x,1,0)\Big] = 0.$$