This review provides an overview of queueing modeling issues and the related performance evaluation and optimization approaches, framed in a joint manufacturing and product engineering setting. Such systems are represented as queueing networks. First, the performance of the queueing networks is evaluated using an advanced queueing network analyzer: the generalized expansion method. Secondly, different model approaches are described and optimized with regard to the key parameters in the network (e.g., buffer and server sizes, service rates, and so on).
Queueing theory is the mathematical study of waiting lines. It enables the mathematical analysis of several related processes, including arrival at the queue, waiting in the queue, and service by the server, and it yields several performance measures with which to evaluate the queueing system under study. More specifically, the focus in this paper is on finite-buffer queueing networks, which are characterized by blocking, a phenomenon that eventually degrades performance, commonly measured via, for example, the throughput of the network.
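As a concrete illustration of such performance measures, consider a single M/M/1/K queue, the simplest finite-buffer building block: its capacity K induces a blocking probability that caps the throughput. The following is a minimal sketch; the function name and rates are illustrative, not from the paper.

```python
def mm1k_measures(lam, mu, K):
    """Return (blocking probability p_K, throughput) for an M/M/1/K queue,
    where K is the total system capacity (buffer plus server)."""
    rho = lam / mu
    if abs(rho - 1.0) < 1e-12:
        pK = 1.0 / (K + 1)                      # limiting case rho = 1
    else:
        pK = (1 - rho) * rho**K / (1 - rho**(K + 1))
    throughput = lam * (1 - pK)                 # effective (accepted) rate
    return pK, throughput

# A larger capacity reduces blocking and raises throughput:
for K in (1, 2, 5, 10):
    pK, th = mm1k_measures(lam=4.0, mu=5.0, K=K)
    print(K, round(pK, 4), round(th, 4))
```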
Queueing modeling and optimization of large scale manufacturing systems and complex production lines have been and continue to be the focus of numerous studies for decades (e.g., see Smith [
This review aims at providing an overview of modeling, performance evaluation, and optimization approaches from a queueing theory point of view. Additionally, the algorithms selected were implemented and tested in some basic queueing network topologies, namely, series, merges, and splits. The numerical results provide new insight into this important class of manufacturing network design problems.
The paper is structured as follows. In Section
Queueing networks are defined as either open, closed, or mixed. In open queueing networks, customers enter the system from outside, receive some service at one or more nodes, and then leave the system. In closed queueing networks, customers never leave or enter the system but a fixed number of customers circulate within the network (see Whitt [
The assumption is that the capacity of the buffer space between two consecutive connected service stations is finite. As a consequence, each node in the network might be affected by events at other nodes, leading to the phenomena of blocking or starvation. In the literature, two general blocking mechanisms are presented: blocking after service (BAS) and blocking before service (BBS). BAS occurs when a customer, upon completing service, finds the downstream buffer full and consequently cannot proceed through the network. BBS implies that a server can start processing the next customer only if there is space available in the downstream buffer; if not, the customer has to wait until a space becomes available. Most production lines operate under BAS; moreover, in the literature it is the most commonly made assumption regarding buffer behavior (see Dallery and Gershwin [
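To make the BAS mechanism concrete, here is a minimal simulation sketch of a saturated two-station line with an intermediate buffer; the function name, rates, and the saturated-source assumption are ours, not the paper's. Under BAS, station 1 keeps a finished job (and stays blocked) until a buffer slot frees up, which happens when an earlier job starts service at station 2.

```python
import random

def tandem_bas_throughput(mu1, mu2, b, n=50000, seed=42):
    """Saturated two-station tandem line, exponential service rates mu1, mu2,
    intermediate buffer of size b >= 1, blocking after service (BAS)."""
    rng = random.Random(seed)
    dep1 = [0.0] * (n + 1)    # departure epochs from station 1
    start2 = [0.0] * (n + 1)  # service start epochs at station 2
    end2 = 0.0                # completion epoch of the last job at station 2
    for i in range(1, n + 1):
        finish1 = dep1[i - 1] + rng.expovariate(mu1)
        # a buffer slot frees when job i-b starts service at station 2
        slot_free = start2[i - b] if i - b >= 1 else 0.0
        dep1[i] = max(finish1, slot_free)   # BAS: blocked until a slot opens
        start2[i] = max(dep1[i], end2)      # station 2 may still be busy
        end2 = start2[i] + rng.expovariate(mu2)
    return n / end2  # long-run throughput estimate

# Throughput grows with the buffer size but stays below min(mu1, mu2):
for b in (1, 2, 5):
    print(b, round(tandem_bas_throughput(1.0, 1.0, b), 3))
```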
Performance evaluation tools for queues include
Initially, the
The
Among the
Finally,
In this paper, the Generalized Expansion Method (GEM) is used as the prime performance evaluation tool. Consequently, this paper provides a selected review based on the GEM and does not explicitly consider other methodologies to obtain the performance measures. Note that the models described fit any performance evaluation tool.
In general, we evaluate the performance of the network via its throughput
As will be detailed below, the GEM transforms the queueing network into an equivalent Jackson network, which can be decomposed so that each node is solved independently of the others (similar to a product-form solution approach). The GEM is an effective and robust approximation technique to measure the performance of open finite queueing networks. The effectiveness of the GEM as a performance evaluation tool has been presented in many papers, including Kerbache and Smith [
The GEM uses BAS, which is prevalent in most systems including production and manufacturing (see Dallery and Gershwin [
The GEM is basically a combination of two approximation methods, namely, repeated trials and node-by-node decomposition. In order to evaluate the performance of a queueing network, the method first divides the network into single nodes with revised service and arrival parameters. Blocked customers are registered at an artificial “holding node” and are repeatedly sent to this node until they are serviced. The addition of the holding node
In the remaining part of this section, we will present a high-level overview of the method. For more detailed information and applications of the GEM, the reader is referred to, for example, Kerbache and Smith [
There are three main steps in the GEM, namely,
The generalized expansion method.
There are two possible states of the finite node, namely, the
The
Consider
As a consequence, the service rate at node
The above equations apply to all finite nodes in the queueing network.
Note that (
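As a rough illustration of the flavor of one expansion step (a simplified sketch, not the full GEM: the function names are ours and the holding-node rate `mu_hold` is a placeholder), one can compute the blocking probability of an isolated M/M/c/K node and then inflate the upstream node's mean service time by the extra visit a blocked customer pays to the holding node:

```python
from math import factorial

def mmck_blocking(lam, mu, c, K):
    """Blocking probability p_K of an M/M/c/K queue (total capacity K >= c)."""
    a = lam / mu
    probs = [a**n / factorial(n) if n <= c
             else a**n / (factorial(c) * c**(n - c))
             for n in range(K + 1)]          # unnormalized state probabilities
    return probs[K] / sum(probs)

def expanded_service_rate(mu, p_block, mu_hold):
    """A blocked customer makes an extra visit to the artificial holding
    node, which inflates the upstream node's effective mean service time."""
    return 1.0 / (1.0 / mu + p_block / mu_hold)

pK = mmck_blocking(lam=4.0, mu=5.0, c=1, K=3)
mu_eff = expanded_service_rate(5.0, pK, mu_hold=10.0)
```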
In this section, we review some of the optimization models found in the literature. Given a directed graph
In general, we can write the generic optimization model as follows:
A number of specific models can be specified based on the above generic model.
When we set
The server allocation problem (CAP) appears if we have
Combining the server and buffer allocation problems by setting
We consider two options to rewrite the objective function, depending on how the multiobjective issue is handled.
First, the objective function can be written as a weighted sum of the two objectives; that is,
We assign a cost of
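One plausible way to write such a weighted-sum objective is sketched below; the weights and unit-cost symbols ($w_{\Theta}$, $w_{C}$, $\beta_i$, $\sigma_i$) are illustrative and not the paper's exact notation:

```latex
\max_{B,\,c}\; F(B, c) \;=\; w_{\Theta}\,\Theta(B, c)
\;-\; w_{C} \sum_{i \in V} \bigl(\beta_i B_i + \sigma_i c_i\bigr)
```

in which $\Theta(B, c)$ denotes the network throughput for buffer vector $B$ and server vector $c$ over the node set $V$, and the second term prices the allocated buffer spaces and servers.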
Secondly, the objective function can be formulated in a multicriteria way; that is,
in which each of the two objectives is considered explicitly, with
Consequently, one obtains an approximation of the Pareto set of solutions for the two objectives. As such, this perspective is more general than the BCAP1 formulation. For further details on multiobjective optimization in general, see Chankong and Haimes [
A slightly different formulation is the optimal routing problem (OROP). Here, the routing probabilities
subject to
in which
The throughput will thus be affected by the effective routing of jobs through the network, the variability of the service distribution, the number of servers, and the number of buffers. Among the papers that dealt with the OROP, we could mention Gosavi and Smith [
A last variation considered is the profit maximization model. The models are thus expanded with financial indicators in order to maximize the profit generated. This profit will be a function of the quantity one can set in the market (i.e., the throughput) and the costs to realize this throughput, which could be the buffer and/or server allocation. The decision variable is thus the investment in buffers or servers (
in which
Achieved profit at the optimal buffer allocation for
Contour plot.
It is worthwhile to state that the models described above are difficult nonlinear integer programming problems. Considering the BCAP model, it can be shown that for a network with
Clearly, the solution space grows exponentially in the number of nodes, but not (exponentially) in the capacity of each node. The complexity of the BCAP model can thus be written as
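The exponential growth in the number of nodes is easy to make concrete (illustrative numbers; `bcap_space` is our helper, not notation from the paper):

```python
# With n finite nodes, up to bmax buffer spaces and cmax servers per node,
# an exhaustive search over the joint allocation visits (bmax * cmax)**n
# configurations: polynomial in the per-node capacity, exponential in n.
def bcap_space(n, bmax, cmax):
    return (bmax * cmax) ** n

for n in (4, 8, 16):
    print(n, bcap_space(n, bmax=10, cmax=5))
```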
While the GEM computes the performance measures for the queueing network, many of the above discussed models need to be optimized on the decision variables defined in
Powell’s [
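Powell's method is derivative-free, which is what makes it attractive when the objective is a black-box evaluator such as the GEM. The sketch below is our own simplified coordinate-wise variant over integer decision variables, with a toy concave surrogate standing in for the real throughput evaluator; it conveys only how such a search wraps around repeated performance evaluations.

```python
def throughput(buffers):
    # toy concave surrogate: diminishing returns minus a linear buffer cost
    # (a stand-in for a GEM evaluation, not part of the search method)
    return sum(1.0 - 0.5 ** b for b in buffers) - 0.05 * sum(buffers)

def coordinate_search(x, fn, lo=1, hi=20, sweeps=10):
    """Cyclic, derivative-free search: try unit moves on each coordinate."""
    x = list(x)
    best = fn(x)
    for _ in range(sweeps):
        improved = False
        for i in range(len(x)):          # one pass over each coordinate
            for step in (+1, -1):        # try a unit move in each direction
                cand = list(x)
                cand[i] = min(hi, max(lo, cand[i] + step))
                val = fn(cand)
                if val > best + 1e-12:
                    x, best, improved = cand, val, True
        if not improved:
            break                        # locally optimal in every coordinate
    return x, best

x, val = coordinate_search([1, 1, 1], throughput)
```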
Genetic algorithms (GA) are optimization algorithms that perform an approximate global search, relying on information obtained from the evaluation of several points in the search space; a population of such points converges toward the optimum through the application of the genetic operators
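A bare-bones GA over integer buffer vectors might look as follows; the population size, rates, and the surrogate fitness are illustrative choices of ours, not those of the cited papers.

```python
import random

def fitness(buffers):
    # toy throughput-minus-cost surrogate (stand-in for a GEM evaluation)
    return sum(1.0 - 0.5 ** b for b in buffers) - 0.05 * sum(buffers)

def ga(n_nodes=3, pop_size=30, gens=60, bmax=20, seed=7):
    rng = random.Random(seed)
    pop = [[rng.randint(1, bmax) for _ in range(n_nodes)]
           for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[:pop_size // 2]          # truncation selection
        children = []
        while len(survivors) + len(children) < pop_size:
            p1, p2 = rng.sample(survivors, 2)
            cut = rng.randrange(1, n_nodes)      # one-point crossover
            child = p1[:cut] + p2[cut:]
            if rng.random() < 0.2:               # mutation: nudge one gene
                i = rng.randrange(n_nodes)
                child[i] = min(bmax, max(1, child[i] + rng.choice((-1, 1))))
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)

best = ga()
```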
In this section, we will focus on one example network and describe the results for some of the different optimization models discussed above. We consider a combination of the three basic topologies (series, split, and merge), as shown in Figure
Combined topology.
We reproduce in Table
Results for the buffer allocation problem (see Smith and Cruz [
s²  Server allocation  Buffer allocation  Total servers  Total buffers  Throughput
0.5  (1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1)  (8 4 4 4 4 4 4 4 4 4 4 4 4 4 4 5)  16  69  4.9899 
1.0  (1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1)  (10 5 5 5 5 4 4 4 4 4 4 4 4 5 5 5)  16  77  4.9879 
1.5  (1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1)  (11 5 5 5 5 5 5 5 5 5 5 5 5 5 5 6)  16  87  4.9877 
Let us now fix the number of buffers beforehand and then optimize on the number of servers used. More specifically, we set all buffers equal to 1 and look at the resulting server allocation. The results are given in Table
Results for the server allocation problem.
0.5  (5 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2)  (1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1)  34  16  4.9997  35.29 
1.0  (5 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2)  (1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1)  36  16  4.9996  35.33 
1.5  (5 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2)  (1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1)  34  16  4.9996  35.37 
Before turning to the results for the example network, we analyze the difference between buffers and servers. We saw that the BAP and CAP give different results in terms of the number of servers versus the number of buffers used. Let us assume that we have a
Throughput increase versus added number of servers (
It is clear that in this case, the first added buffer or first added server gives the largest contribution to the throughput value, which is limited by the external arrival rate
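These diminishing returns can be reproduced on a single finite node in isolation (an M/M/c/K sketch with illustrative parameters of our choosing, not the GEM network evaluation used for the tables):

```python
from math import factorial

def mmck_throughput(lam, mu, c, K):
    """Throughput lam * (1 - p_K) of an M/M/c/K queue (capacity K >= c)."""
    a = lam / mu
    probs = [a**n / factorial(n) if n <= c
             else a**n / (factorial(c) * c**(n - c))
             for n in range(K + 1)]          # unnormalized state probabilities
    return lam * (1.0 - probs[K] / sum(probs))

base     = mmck_throughput(lam=5.0, mu=2.0, c=3, K=3)  # no waiting room
add_buf1 = mmck_throughput(5.0, 2.0, 3, 4)             # first buffer space
add_buf2 = mmck_throughput(5.0, 2.0, 3, 5)             # second buffer space
add_srv1 = mmck_throughput(5.0, 2.0, 4, 4)             # first extra server
print(add_buf1 - base, add_buf2 - add_buf1, add_srv1 - base)
```

With these parameters the first added buffer yields a larger gain than the second, and the first extra server yields more than the first extra buffer, in line with the discussion above.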
The results for the joint buffer-server allocation problem are presented in Table
Results for the joint buffer-server allocation problem.
5.0  0.5  1 : 8  (3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3)  (3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3)  48  48  0  4.9996  5.76 
1 : 4  (3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3)  (3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3)  48  48  0  4.9996  10.0  
1 : 2  (3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3)  (3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3)  48  48  0  4.9996  16.4  
1 : 1  (5 3 3 3 3 2 2 2 2 2 2 2 2 3 3 5)  (5 3 3 3 3 2 2 2 2 2 2 2 2 3 3 5)  44  44  0  4.9998  22.2  
2 : 1  (2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2)  (5 3 3 3 3 2 2 2 2 2 2 2 2 3 3 5)  32  44  12  4.9989  26.5  
4 : 1  (3 1 1 1 1 1 1 1 1 1 1 1 1 1 1 3)  (3 5 5 5 5 3 3 3 3 3 3 3 3 5 5 3)  20  60  40  4.9974  26.6  
8 : 1  (1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1)  (11 6 6 6 6 4 4 4 4 4 4 4 4 6 6 11)  16  90  74  4.9994  23.0  
1.0  1 : 8  (3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3)  (3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3)  48  48  0  4.9994  5.94  
1 : 4  (5 3 3 3 3 2 2 2 2 2 2 2 2 3 3 5)  (5 3 3 3 3 2 2 2 2 2 2 2 2 3 3 5)  44  44  0  4.9997  9.09  
1 : 2  (5 3 3 3 3 2 2 2 2 2 2 2 2 3 3 5)  (5 3 3 3 3 2 2 2 2 2 2 2 2 3 3 5)  44  44  0  4.9997  15.0  
1 : 1  (5 3 3 3 3 2 2 2 2 2 2 2 2 3 3 5)  (5 3 3 3 3 2 2 2 2 2 2 2 2 3 3 5)  44  44  0  4.9997  22.3  
2 : 1  (3 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3)  (3 3 3 3 3 2 2 2 2 2 2 2 2 3 3 3)  34  40  6  4.9984  26.2  
4 : 1  (2 2 2 2 2 1 1 1 1 1 1 1 1 2 2 3)  (6 3 3 3 3 4 4 4 4 4 4 4 4 3 3 4)  25  60  35  4.9989  28.1  
8 : 1  (1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1)  (13 6 6 6 6 4 4 4 4 4 4 4 4 6 6 13)  16  94  78  4.9987  24.1  
1.5  1 : 8  (5 3 3 3 3 2 2 2 2 2 2 2 2 3 3 5)  (5 3 3 3 3 2 2 2 2 2 2 2 2 3 3 5)  44  44  0  4.9996  5.24  
1 : 4  (5 3 3 3 3 2 2 2 2 2 2 2 2 3 3 5)  (5 3 3 3 3 2 2 2 2 2 2 2 2 3 3 5)  44  44  0  4.9996  9.15  
1 : 2  (5 3 3 3 3 2 2 2 2 2 2 2 2 3 3 5)  (5 3 3 3 3 2 2 2 2 2 2 2 2 3 3 5)  44  44  0  4.9996  15.0  
1 : 1  (5 3 3 3 3 2 2 2 2 2 2 2 2 3 3 5)  (5 3 3 3 3 2 2 2 2 2 2 2 2 3 3 5)  44  44  0  4.9996  22.4  
2 : 1  (3 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3)  (3 3 3 3 3 2 2 2 2 2 2 2 2 3 3 3)  34  40  6  4.9979  26.8  
4 : 1  (2 2 2 2 2 1 1 1 1 1 1 1 1 2 2 3)  (6 3 3 3 3 4 4 4 4 4 4 4 4 3 3 4)  25  60  35  4.9983  28.7  
8 : 1  (1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1)  (15 7 7 7 7 4 4 4 4 4 4 4 4 7 7 15)  16  104  88  4.9986  25.4 
We observe that (near) zero-buffer configurations are identified where appropriate, that is, where servers are relatively cheap compared to buffers. Varying the squared coefficient of variation of the service time distribution
The above numerical results for the buffer allocation problem, the server allocation problem, and the joint buffer-server allocation problem show that significant gains can be achieved in manufacturing systems. Specifically, setting the buffers and servers appropriately greatly affects the throughput of these systems. This is important, as these systems need to be utilized as highly as possible, given the high investments involved. Our models and optimizations show that the optimal configurations are not always straightforward, so advanced models and solution methods are needed. We have followed a queueing network approach with finite buffers, as this most closely resembles reality; this modeling approach is of course much harder than, for example, an infinite-buffer queueing network approach. The various experiments show that our solution methodology is powerful and suitable for the different types of models handled in this paper, offering managers and manufacturing-system designers a powerful tool to work with.
We saw that the BAP and CAP give markedly different results. We also note that while the addition of the first extra server gives a certain increase in throughput, the addition of the first buffer space generally gives a smaller increase. In other words, to achieve the same increase in throughput using only buffers, more extra buffer spaces are needed than extra servers.
This review provided an overview of the different modeling issues, the performance evaluation, and optimization approaches for manufacturing systems from a queueing theory perspective. We discussed the merits of the Generalized Expansion Method as a performance evaluation tool for finite queueing networks; this methodology has proved in the literature to be a valuable approach. Secondly, different optimization models were discussed, namely, the buffer allocation problem, the server allocation problem, the joint buffer-server allocation problem, and some other models. The different optimization models are shown to be hard nonlinear integer programming problems, which can be solved approximately with a Powell heuristic. The paper ended with an overview of results for the different models considered on a complex queueing network.
In this paper, we considered the throughput as the main performance measure. Instead of the throughput, it would be interesting to evaluate the behavior of the models based on cycle time, work-in-process (WIP), or other performance measures.
In a number of industrial improvement projects carried out, we observed that the critical issue to be able to use the above models is related to data availability. More specifically, processing rates, arrival rates, uncertainty in the service process, and so on need to be extracted from the available databases. An interesting approach to obtaining the relevant data is the effective process time (EPT) point of view (see Hopp and Spearman [
Topics for future research on the queueing part include the analysis and optimization of networks with cycles, for example, to model many important industrial systems that have loops, such as systems with captive pallets and fixtures or reverse streams of products due to rework, or even the extension to
External Poisson arrival rate to the network
Poisson arrival rate to node
Effective arrival rate to node
Exponential mean service rate at finite node
Effective service rate at finite node
Blocking probability of finite queue
Feedback blocking probability in the expansion method
The artificial holding node (queue) preceding node
Number of servers at node
Total capacity at node
Buffer capacity at node
Set of nodes (queues) in the queueing network
Set of arcs (pairs of nodes) in the queueing network
Traffic intensity at node
Mean throughput rate at node
Squared coefficient of variation of the service time distribution at node
The authors declare that there is no conflict of interests regarding the publication of this paper.
The research of professor Frederico Cruz has been partially funded by the Brazilian agencies, CNPq (