Queuing network models are used to analyze performance of computer systems. Here are some examples to introduce queuing (or queueing) networks.
EXAMPLE 1
OTP, Inc. is producing a simple online transaction processing system with two disk files. Fixed-length account records are read and updated using a computed offset into the account file, and a transaction log is written by appending to the log file. The marketing and sales departments want a system that supports 80 user terminals and allows operators to complete transactions in average time of 5 seconds or less.
Research shows an average transaction takes about 100 milliseconds (ms.) of MYTHICAL CPU time (Although MYTHICAL can multitask copies of the same program, the first example uses one program). Three disk I/O's average 50 ms. each. Operators are able to key and send transactions in 5 seconds.
Factors such as I/O buffering, data transfer rates, and datacomm line speed are ignored, to keep this simple.
CURRENT OPERATION THINK TIME CPU TIME I/O TIME
Operator think and send 5,000 0 0
CPU process message 0 35 0
Read account 0 0 50
Process transaction 0 30 0
Write log to disk 0 0 50
Complete processing 0 10 0
Rewrite account to disk 0 0 50
Send response to operator 0 25 0
_____ ____ ____
TOTALS: 5,000 ms. 100 ms. 150 ms.
Some questions about the "one operator" scenario:
- How long to process one transaction?
_ Operator think time 5,000 ms. + CPU 100 ms. + disk I/O 150 ms. = 5.25 seconds.
- How many transactions per minute?
_ 60 seconds / 5.25 = 11.4 transactions / minute
- What do the CPU and disk do while the operator thinks and keys?
_ They are idle, available for other processes.
EXAMPLE 2
A queuing network is a system of one or more service queues, so let's look at a queue.
After observing a service queue for an amount of time (T), a number of jobs arrive (A), the servicing device is busy for some time (B), and a number of jobs are completed (C), and leave.
*********
---------* *
Waiting * Job *
Job ----> queue * service * ----> Job Completions
Arrivals * device * (C)
(A) ---------* *
*********
T = Observation Time B = Device Busy time
Formulas for Arrival rate, Completion rate, Utilization, and Mean Service time, Table assumes Arrivals equal Completions:
Basic formulas:
Arrival rate: a = A/T, Arrivals / Observation Time
Completion rate: X = C/T, Completions / Observation Time
Utilization: U = B/T, Busy Time / Observation Time
Mean Service time: S = B/C, Busy Time / Completions
Apply formulas to Example 1, as a system:
T = 1 minute, B = 1 minute, A = C = 11.4 per minute
Arrival rate: a = 11.4/1, 11.4 per minute
Completion rate: X = 11.4/1, 11.4 per minute
Utilization: U = 1/1, 100 percent
Mean Service time: S = 1/11.4 = .0877 minute = 5.263 seconds
The last three formulas are equivalent to the following useful formula for "peak rate":
System U = X * S
(Because U = B/T = C/T * B/C)
For peak theoretical transaction rate, set Utilization to 1 (100%), and plug in the mean service time:
1 = X * 11.4
X = 1 / 11.4 = peak rate .0877 per minute (1 every 5.263 seconds)
So, U = 11.4 * .0877, or ~ 100 percent
Note: The Operator, CPU and Disk are also queues. So, Utilization
for CPU, Disk and Operator fit to U = Busy Time / Observation Time:
U of CPU = .0100 / 1 = 1 percent
U of Disk = .0150 / 1 = 1.5 percent
U Operator = .95 / 1 = 95 percent
EXAMPLE 3
Suppose we add more terminal operators, to capacity. Given enough operators, think time is less relevant.
- How long to process one transaction?
_ CPU 100 ms. + disk I/O 150 ms. = 250 ms. to process one transaction.
- How many transactions per minute?
_ 60 seconds / 0.250 seconds = 240 transactions / minute
- What does the CPU do while the disk is busy, and vice versa?
_ There is some overlap, but CPU and disk spend time waiting for queued activity to clear.
- What amounts of CPU and disk are utilized?
_ CPU 100 ms. out of 250 ms. = 0.4 or 40 percent use. Disk I/O 150 ms. of 250 ms. = 0.6 or 60 percent.
Since the system is using only 40 percent of the CPU and 60 percent of the disk, we should be able to process more transactions per minute by running two or more copies of the program. Running multiple program copies may require extra effort recovering from abort conditions, which many database systems handle.
Running multiple program copies should boost the disk usage closer to 100%, increasing throughput. Then, adding another disk can spread the I/O and help with the disk bottleneck, again, increasing throughput. With the data already gathered, queuing network models can analyze the results of adding another program copy, and adding another disk.
SUMMARY OF DATA FOR MODELING ONE DISK WITH MULTIPLE PROGRAM COPIES
NUMBER OF ROUTING PROBABILITY MEAN SERVICE
DEVICE VISITS TO CPU, TO DISK, LEAVE SYSTEM TIME, SECONDS
_________________________________________________________________________
CPU 4 0 0.75 .25 0.025
DISK 3 1 0 0 0.050
The following model takes into account routing probabilities.
ONE DISK WITH MULTIPLE PROGRAM COPIES
Transactions ********* Transactions
in -----* * 25 percent of CPU vists out
------------> * * ----------------->---------------------------->
* CPU *
----> * * ----. 75 percent of Service time
/ -----* * \ CPU visits .050 Sec.
| ********* \ *********
^ Mean service \ -----* *
| time .025 sec. \ * *
| `-----> * DISK * ---->.
| * * |
^ -----* * |
| ********* |
| 100 percent of disk visits V
`<-------------------------<------------------------<--------
A transaction requires 4 CPU visits and 3 disk visits. Number of visits times mean service shows the disk is used more, and is a bottleneck:
CPU transaction time: 4 jobs * 0.025 = 0.1
Disk transaction time: 3 jobs * 0.050 = 0.150
Use of a device (U) = Completion rate (X) * Mean service time (S). Try setting disk use (U) to 100%:
1 (U) = I/O completion rate (X) * 0.050 seconds (S)
I/O completion rate - 1 / 0.050 seconds = 20 per second
Twenty per second is the peak disk job completion rate (not the system transaction rate). With the disk use (U) set to 100%, let's work back to the CPU utilization. Noting visits to disk must equal 75% of visits to CPU:
20 disk jobs/seconds = 0.75 * CPU jobs/second
CPU jobs/seconds = 20 jobs/second divided by 0.75 = 26.6 jobs /second
CPU use = CPU completion rate * CPU mean service time
(U = X * S)
CPU use = 26.6/second * 0.025 seconds = 0.66 = 66 percent
Jobs leaving the system must equal 25 percent of jobs leaving the CPU:
System jobs/second = 0.025 * CPU jobs/second
System transaction rate = 0.25 * 26.6/second = 6.6 transactions/second
EXAMPLE 4
Earlier, a model was shown where one disk, with one program copy, used 40% of CPU and 60% of disk, supporting up to 21 operators. With one disk and multiple program copies, disk use gets close to 100% and CPU use increases to around 66 percent. How many terminals will the multiple program scenario support? A closed-loop terminal capacity model is shown next:
TERMINAL CAPACITY MODEL
Terminal one *******
.<------------* *<------------.
/ ******* \
/ \
/ ******* \
/<----------------* *<----------------\
.<-----< ******* |<----.
| \ . / |
| \ THE . OPERATORS / |
| \ . / |
| \ Terminal N ******* / |
| .<------------* *<------------' |
| ******* |
| N = number of terminals, |
| to be determined |
| |
| ********* |
| -----* * |
`----------------> * System * -------------------->'
-----* *
*********
N = Number of terminals to be determined
R = Mean system response time, estimated five seconds
Z = Mean think time
X = System throughput rate, 6.6 transactions/second, one disk, multi-copy
Assuming job arrivals equal job completions, response time R = (N / X) - Z. Also, number of terminals N = (R + Z) * X, and X = N / (R + Z).
Terminal capacity, N = (R + Z) * X
= (5 + 5) * 6.6, or 10 * 6.6 = 66 terminals
EXAMPLE 5
The 66 terminal limit does not meet goals, and the system is limited by maximum use of the single disk drive. Adding a disk drive should increase system throughput. The account file is placed on Disk A, the log file on Disk B. Visit ratios are recomputed for two disks. From CPU, two of four visits go to disk A, one of four to Disk B, and one leaves the system.
TWO DISK MODEL
Transactions
Transactions ********* 25 percent of CPU vists out
in -----* * ----------------->----------------------------->
------------> * *
* CPU * ------.
.---> * * \ 50 percent of
/ -----* * --- . \ CPU visits
| ********* \ \ *********
^ Mean service \ \ -----* *
| time .025 sec. \ \ * DISK *
| \ `-------> * A * ---->.
| \ * accts. * |
^ \ -----* * |
| \ ********* |
| \ |
| 25 percent of \ ********* |
| CPU visits \ -----* * |
| \ * DISK * |
| `---> * B * |
| * log * |
| -----* * |
| ********* |
| |
| 100 percent of disk visits V
`<-------------------------<------------------------<--------------'
Disk A would appear to be the bottleneck here. Working back from disk A try setting its' utilization to 100 percent and apply formulas, taking into account mean service time at disk A:
1 = X * 0.050, X = 20 per second at A
(U = X * S)
Arrivals equaling completions, disk A throughput must equal 50 percent of CPU throughput:
20/second = 0.50 * CPU throughput
CPU throughput = (20/second) / 0.050 - 40 per second at CPU
Disk A use = (40/second * 0.025) - 1, max 100%, theoretically
System throughput must equal 25 percent of CPU throughput:
System throughput = (0.25 * 40) / second = 10 transactions/second
Next, determine the terminal capacity of the two-disk system:
N = Number of terminals to be determined
R = Mean system response time, estimated five seconds
Z = Mean think time
X = System throughput rate, 6.6 transactions/second, one disk, multi-copy
N = (R + Z) * X
N = (5 + 5) * 10, = 10 * 10 = 100 terminals
SUMMARY OF RESULTS
TRANSACTION UTILIZATION TERMINALS
CONFIGURATION (All single CPU) RATE LIMIT CPU DISK SUPPORTED
One program, one operator. one disk 0.19/second 0.01 0.015 1
11.4/minute
One program, multi-operator, one disk 4/second = 0.4 0.6 40
240/minute
Multi-program, multi-operator, one disk 6.6/second = 0.66 1.0 66
396/minute
Multi-program, multi-operator, two disks 10/second = 1.0 A 1.0 100
600/minute B 0.5
EXERCISE
The new LEGEND CPU is twice as fast as the MYTHICAL. Disks for the LEGEND are improved to provide 40 ms. mean access time. Three disks are used as follows: Log file assigned to one disk, even numbered accounts are on a second disk, odd numbered accounts are on a third disk. Develop models to predict transaction rate, utilizations, and terminals supported.
CAUTIONS AND NOTES
The material presented is simple and theoretical. Datacomm line times, varieties of file organization, disk transfer rates, and other factors were ignored. The models can help predict theoretical limits, that can't be met, but which can't be exceeded. For example, the models given show hypothetical 100% use of devices, which does not happen in real life.
The formulas require determining "correct comparable units of measurement"; Numbers must be converted to comparable units. The models generally require "job flow balance": job arrivals equal job completions for all devices and the system.
Where feasible, use measurements on real systems to check model assumptions.
Two criticisms of computer network queuing models: 1) I/O overlap is not adequately addressed, and 2) using mean service times ignores the fact that service times are not homogeneous. Nevertheless, the models can provide useful information.
These ideas are addressed in the articles mentioned next. Finally, queuing network modeling software is available from schools and commercial sources, to simplify the modeling task.
BIBLIOGRAPHY
BUZEN, J.P., and DENNING, P.J.
"The Operational Analysis of Queuing Network Models"
ACM Computing Surveys, Vol. 10, No. 3 (Sep. 1978, pp. 225-261)
ANDERSON, G.E.
"The Coordinated use of Five Performance Evaluation Methodologies"
Communications of the ACM, Vol 27, No. 2 (Feb. 1984, pp. 119-125)