COMP9334
1
COMP9334
Capacity Planning for Computer Systems
and Networks
Week7: Discreteeventsimulation(2)
S1, 2018
COMP9334
2
Last week
• Two topics
• How to perform. discrete event simulation of queues
• How to general random numbers of a given probability distribution
by using inverse transform. method
• You should be able to simulate
• Many types of queues
• Single-server or multi-server
• Different queueing disciplines
• Many inter-arrival time and service time distributions
• However, there are a number of problems …
S1, 2018
COMP9334
3
Problem: data interpretation, simulation length
• Figure from
• Week 6’s
revision problem
#1
S1, 2018
COMP9334
4
Problem: How do we compare 2 alternative choices?
• Week 6’s
Revision
Problem, #2
• From Queueing
theory, we
expect the
M/M/1 system to
be better but
simulation
doesn’t seem to
suggest that?
S1, 2018
COMP9334
5
Analysis of simulation results
• A very important topic but it is very often ignored
• Simulation is notsimply about
• Writing a computer program
• Verifying the correctness of the program
• Running the simulation once and present the results
• Verifying the correctness of the simulation program is
important
• It is equally important to do sound statistical analysis on the
simulation results obtained
S1, 2018
COMP9334
6
This lecture
• Analysis of simulation results
• How to choose simulation parameters?
• How long should I simulate for?
• How many times should I repeat the simulation?
• How do I compare two alternative systems?
• Variance reduction technique
• Simulation tools
S1, 2018
COMP9334
7
Analysis of simulation data
• There are many statistical methods to analyse data
depending on the situation
• We will focus on analysing steady state mean valueonly
• For example, we are interested to find the steady state
mean response timeof a queue
• Recall that we talked about
• Transient and steady state behaviour of queue in Week 3
• Steady state of Markov chain in Week 4
S1, 2018
COMP9334
8
What is steady state? (1)
• Let us simulate an M/M/1 queue with
• Arrival rate l= 0.7
• Service rate µ= 1
• Simulation ends when master clock is 50000s
• In this simulation, we record the response time for each job
• Let X(k) = Response time of k
th
job
• The next page shows X(k) changes continuously
• Let N denote the number of jobs in the simulation
• N = 35000 for our simulation
• Last week, we computed the mean response time using
S1, 2018
COMP9334
9
Response time continuously changes over time
• This graph
shows response
time of X(k) of
the k-thjob
where k = 1 to
35000
• Note response
time continuously
varies
• Response time
does notsettle to
a constant value
• But mean
response time
doessettle
S1, 2018
COMP9334
10
What is steady state? (2)
• Let us instead compute the running mean M(k) where
• For example, if k = 5, then
• Thus M(5) is the mean response time of the first 5 jobs
• In general, M(k) is the mean response time of the first kjobs
• Let us plot M(k) -see the next slide
S1, 2018
COMP9334
11
Transient behaviour versus steady state behaviour
jo
b
s
S1, 2018
COMP9334
12
Transient removal: Introduction (1)
• The early part of the simulation displays transient (= non-
steady state behaviour)
• The later part of the simulation converges or fluctuates
around the steady state value
• Since we are interested in the steady state value, we
should notuse the transient part of the data to compute the
steady state value
• We should remove the transient part and only use the
steady state part to compute the mean
• One method to identify the transient part is to use visual
inspection
• Note: In the previous slide, we have the theoretical value to guide
us but in practice you don’t, you will learn a transient removal
method based on batch means in your Revision Problem this week
S1, 2018
COMP9334
13
Transient removal: Introduction (2)
• Let us assume that the first mjobs constitute the transient
part and there are Njobs altogether, we should revise the
formula to compute the mean to
• Note: We used too simple a method to compute the mean last
week but I didn’t want to complicate things at that time!
• Important: You must run the simulation long enough so that
you have a good number of data points (or jobs) in the
steady state part.
• In order to remove transient properly, you need to understand replications
S1, 2018
COMP9334
14
Independent replications
• Assume that we carry out simulations to find out what the
steady state mean response time of a queueing system is
• Important note: We cannot get exact answer from simulation
• We express our simulation results as e.g. there is a probability of
95% that the mean response time is in the interval [3.1,3.3].
• We call the interval [3.1,3.3] the 95% confidence interval.
• Independent replications: Repeat the simulation a number
of times using differentsets of random numbers
• Why independent replications?
• Independent replications allow us to use statistical method to
estimate a confidence interval of steady state mean response time
S1, 2018
COMP9334
15
Example: Independent replications
• We want to use simulation to estimate the mean response
time of an M/M/1 queue with
• Arrival rate l= 0.7
• Service rate µ= 1
• Simulation ends when master clock is 16000s
• We repeat the experiment 30 times using different sets of
random numbers
• For each independent experiments
• We record the response time of all the jobs
• Remove the transient part
• Compute the mean response time using the steady state section
• We obtain 30 different estimates of the mean response
time, one from each independent experiment
• These independent estimates allow us to find a confidence
interval
S1, 2018
COMP9334
16
Example (Cont’d)
• The blue circles show the
estimated mean response time
from the 30 independent
experiments
• The red line is the 95%
confidence interval
• There is a 95% probability that
the true mean response time that
we want to estimate is in the
interval [3.30,3.62]
• The green line is the
theoretical mean response
time (which you should not
normally know).
S1, 2018
COMP9334
17
Computing the confidence interval (1)
• Assume that you do nindependent replications
• In each replication, you remove the transient part and
compute an estimate of the mean steady state response
time
• Let us call your estimate from the k
th
replication, T(k)
• Compute the sample mean
• And the sample standard deviation
Note: for sample standard
deviation, (n-1)is in the
denominator, not n.
S1, 2018
COMP9334
18
Computing confidence interval (2)
• There is a probability (1-a) that the mean response time
that you want to estimate lies in the interval
S1, 2018
COMP9334
19
Computing confidence interval (3)
• The value can be obtained from looking up
the Student t distribution table
• Note: A Student t table has been provided on the web site
• There are also programs that compute it
• In Matlab, you can use tinv(1-alpha/2,n-1)
S1, 2018
COMP9334
20
Example: Independent replications (cont’d)
• From the example on p.15
• The sample mean of (n = ) 30 replications = 3.47
• The sample standard deviation of 30 replications is 0.43
• If we want to compute the 95% confidence interval, a= 0.05
• Since we did 30 independent experiments and want 95%
confidence interval, we use
• From the t-distribution table, the value of is
2.0452, the 95% confidence interval is
S1, 2018
COMP9334
21
More on confidence interval
• Confidence interval
a% confidence interval
mid-point
S1, 2018
COMP9334
22
What can we get from simulation?
• If your queueing problem has a mathematical solution, you
will get onevalue for the steady state mean response time
• If you simulate a queue to try to estimate the mean
response time, you will notknow the exact value of the
steady state mean response time
• Simulation can only give you a confidence interval of what
you want to estimate
• You can reduce the confidence interval by doing many
independent replications!
S1, 2018
COMP9334
23
Choice of simulation parameters (1)
• Simulation parameters
• Length of simulation
• Number of replications
• Accuracy
• Unfortunately, there are nohard rules to choose them. You
will need to do some trial and error
• If the length of simulation is not long enough, you will need to
increase it
• If the number of replications is not enough to give you the desired
accuracy, you will need to increase it
S1, 2018
COMP9334
24
Choice of simulation parameters (2)
• Length of simulation
• Must be longer than the transient
• Should have a good number of data point in the steady state part
• Hard to say what “good” is. Get a few hundred if you can. The more
the better but of course your simulation will run longer
• Number of replications
• You may want to have 5 replications to start with
• After removing the transient, compute the confidence interval for
your estimate.
• Compare the width of your confidence interval with your desired
accuracy. If the confidence interval that you have obtained is too
wide, you will need to increase the number of replications.
• Progressively (basically by trial-and-error), increase the number of
replications until you get the desired level of accuracy
S1, 2018
COMP9334
25
Comparing two systems: motivation
• An application of simulation is to compare two systems
• For example, in last week’s revision question, you used
simulation to compare the mean response time of
• System 1: M/M/1 queue with l= 0.9 and µ= 1
• System 2: M/M/2 queue with l= 0.9 and µ= 0.5 for both server
• If you use analytical method, you can find the steady state
mean response time of both systems exactly and you
compare two numbers
• If you use simulation, you get a confidence interval for each
system instead. How do you compare them?
S1, 2018
COMP9334
26
Example: Comparing two systems
• Let us assume our goal is to use simulation to compare:
• System 1: M/M/1 queue with l= 0.9 and µ= 1
• System 2: M/M/2 queue with l= 0.9 and µ= 0.5 for both server
• For each system we carry out 3 independent replications
• That is, we use 6 sets of independent random numbers together
• After removing the transient, the estimated mean response times are:
• System 1: 6.8769, 8.5769, 10.6340
• System 2: 8.8087, 7.4616, 9.1565
• In order to compare them, let us pair up these results
• 1st experiment for System 1 with 1st experiment for System 2
• 2nd experiment for System 1 with 2nd experiment for System 2 etc.
S1, 2018
COMP9334
27
A paired-tconfidence interval
• Let us summarise the data in a table
• EMRT = estimated mean response time
EMRT System 1 EMRT System 2 EMRT System 2 -EMRT System 1
Rep. 1
6.8769 8.8087 1.9318
Rep. 2
8.5769 7.4616 -1.1154
Rep. 3
10.6340 9.1565 -1.4775
• We compute the 100 (1-a)% confidence interval of the difference between 2
systems (= last column)
• Let us denote the computed confidence interval by [p,q]
• Case 1: p,q > 0 èSystem 1 is better than System 2 with probability (1-a)
• Case 2: p,q 0 p Mean of Sys. 2
CIs overlap and
mean of a system
is in the CI of the
other: System are
notdifferent
CIs overlap and
mean of any one is
not in the CI of the
other: do t-test
S1, 2018
COMP9334
37
Ex: Multicast protocol design for wireless mesh networks
• Comparing 3
multicast protocols
(WCMA, SPTand
RCAM) for
wireless mesh
networks
• The thin vertical
line shows the
confidence interval
• What conclusion
can you draw?
• Source: Chou et al, “Maximizing Broadcast and Multicast Traffic Load through
Link-Rate Diversity in Wireless Mesh Networks”, you can download it from
my web site: http://www.cse.unsw.edu.au/~ctchou/
S1, 2018
COMP9334
38
Simulation tools and some applications (1)
• You do not always have to write your own simulation
programs from scratch
• There are plenty of simulation tools available
• Many with GUI
• Simulation tools are used in a lot in computer networking
research
• Protocol #1 is the existing protocol, you have designed Protocol
#2. You want to see whether Protocol #2 is better or not.
• You have two options (Option #1 and Option #2) to design a
network. Which option is better?
S1, 2018
COMP9334
39
Simulation tools and some applications (2)
• Some examples of publicly available simulation tools
• General purpose: OMNet++
• http://www.omnetpp.org/
• For networking research: ns3
• http://www.isi.edu/nsnam/ns/
• Some commercial tools
• For network design: OPNET, Qualnet
• http://www.opnet.com/
• http://web.scalable-networks.com/content/qualnet
• Important note: These tools save you time in writing
simulation program but don’t forget that you still need
to analyse your simulation results using statistically
sound methods!
S1, 2018
COMP9334
40
Summary
• Simulation is not just a computer programming exercise
• You need to make sure that your program is correct
• It is also important to analyse your results statistically
• Methods discussed include
• Transient removal technique
• Confidence interval
• Determining number of replications
• Comparing 2 alternatives
• Variance reduction
• Unfortunately, a lot of published research papers in
computer networking did not do sound statistical analysis
• Optional reading: Pawlikowski et al, “On credibility of simulation
studies of telecommunication networks”, IEEE Communications
Magazine, Pages 132-139, January 2002.
S1, 2018
COMP9334
41
References
• The primary reference is Law and Kelton, “Simulation Modelling and
Analysis”
• Transient removal, Sections 9.1, 9.2 and 9.5
• Replication method, Section 9.5.2
• Comparing two alternatives, Section 10.1, 10.2 (10.2.1 only)
• Common random numbers, Section 11.2
• Raj Jain, “The Art of Computer Systems Performance Analysis” has
materials on
• Transient removal methods, Section 25.3
• Calculating confidence interval, Section 13.2
• Comparing two alternatives, Sections 13.3, 13.4 (13.4.1 and 13.4.3 only)
• Note that we have only touched on the basic of statistical analysis of
simulation data. The above two books (outside the specified sections)
will provide you with more in depth discussion on the topic.
• If you are interested to know the mathematical background on
confidence interval, student t-distribution etc., a possible reference is
Wackerly et al, “Mathematical Statistics with Applications”.