CS457 - System Performance Evaluation - Winter 2010
Public Service Announcements
- Mid-term conflicts
- Assignment 2
Lecture 14 - Data Analysis III
Analysis of Variance (ANOVA) aka Linear Models (pdf)
Zero Factor Analysis of Variance
One Factor Analysis of Variance
Two Factor Analysis of Variance
New Concepts
- Main effects versus interactions
The Linear Model
- Same assumptions as above,
- Except, assume that the data is well described by the model:
overall_mean + a_i + b_j + ab_ij + error
- There is one degree of freedom for the overall mean
- There are M_a - 1 degrees of freedom for the levels of the
first factor
- There are M_b - 1 degrees of freedom for the levels of the
second factor
- There are M_a * M_b - M_a - M_b + 1 degrees of freedom for the
interaction
- There are N - M_a * M_b degrees of freedom for the error
- Three ways to consider the model
- Main effects only, no interactions
- Main effects plus interactions
- Interactions only, no main effects
- Null hypothesis: All of the terms a_i, b_j,
ab_ij are zero.
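The degrees of freedom listed above should add up to N, the number of observations. A minimal sketch in Python that checks this; the function name and the example sizes (3 levels per factor, 18 observations) are assumptions made for illustration, not values from the lecture.

```python
def degrees_of_freedom(m_a, m_b, n):
    """Split n observations into the degrees of freedom of each model term."""
    return {
        "overall_mean": 1,
        "a": m_a - 1,
        "b": m_b - 1,
        "ab": m_a * m_b - m_a - m_b + 1,  # same as (m_a - 1) * (m_b - 1)
        "error": n - m_a * m_b,
    }


# Hypothetical design: 3 x 3 levels with 2 observations per cell.
dof = degrees_of_freedom(m_a=3, m_b=3, n=18)
print(dof)
print(sum(dof.values()))  # the terms account for every observation
```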
The Calculation
- Remove the overall mean from the data, calculate the total variance
- Separate the data into cells, one for each pair of levels of the two
factors
- Find the a_i that best fit the data, which are just the means
of the cells at level i of the first factor.
- Find the b_j that best fit the data, which are just the means
of the cells at level j of the second factor.
- Calculate the remaining variance, which is the error variance.
- The difference between the total variance and the error variance is the
treatment variance.
- Form the ratios of the treatment variances for a and for b
to the error variance, with degrees of freedom taken into account.
- Check against the percentage points of the F distribution.
- If the a result is significant then at least one of the
coefficients a_i of the model is different from zero.
- If the b result is significant then at least one of the
coefficients b_j of the model is different from zero.
- Find the ab_ij that best fit the left-over data in each
cell
- Form the ratio
- Check against the F distribution
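The calculation above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the course's reference implementation: it assumes a balanced design (the same number of replicates, at least two, in every cell), fits the full model with interactions, and uses the within-cell variance as the error variance. The function name and the small data set are hypothetical. Checking the resulting ratios against the percentage points of the F distribution is left to a table.

```python
def mean(values):
    return sum(values) / len(values)


def two_factor_anova(data):
    """data[i][j] lists the replicates for level i of a and level j of b.

    Assumes a balanced design with at least two replicates per cell.
    Returns (sums of squares, degrees of freedom, F ratios).
    """
    m_a, m_b = len(data), len(data[0])
    r = len(data[0][0])
    n = m_a * m_b * r

    cell_mean = [[mean(cell) for cell in row] for row in data]
    grand = mean([y for row in data for cell in row for y in cell])

    # Main effects: level means once the overall mean is removed.
    a = [mean(cell_mean[i]) - grand for i in range(m_a)]
    b = [mean([cell_mean[i][j] for i in range(m_a)]) - grand
         for j in range(m_b)]
    # Interactions: what is left of each cell mean after the main effects.
    ab = [[cell_mean[i][j] - grand - a[i] - b[j] for j in range(m_b)]
          for i in range(m_a)]

    ss = {
        "a": r * m_b * sum(x * x for x in a),
        "b": r * m_a * sum(x * x for x in b),
        "ab": r * sum(x * x for row in ab for x in row),
        "error": sum((y - cell_mean[i][j]) ** 2
                     for i in range(m_a)
                     for j in range(m_b)
                     for y in data[i][j]),
    }
    dof = {
        "a": m_a - 1,
        "b": m_b - 1,
        "ab": (m_a - 1) * (m_b - 1),
        "error": n - m_a * m_b,
    }
    ms_error = ss["error"] / dof["error"]
    f = {k: (ss[k] / dof[k]) / ms_error for k in ("a", "b", "ab")}
    return ss, dof, f


# Hypothetical 2 x 2 experiment with two replicates per cell.
data = [[[1, 3], [5, 7]],
        [[2, 4], [10, 12]]]
ss, dof, f = two_factor_anova(data)
print(ss)
print(dof)
print(f)
```

For a balanced design the four sums of squares add up to the total sum of squares about the overall mean, which is a useful sanity check on any implementation.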
Two Cell Mean Tables
- Factor A: Block Transfer Size
- Level 1: 1 word
- Level 2: 16 words
- Level 3: 256 words
- Factor B: Cache Line Size
- Level 1: 4 words
- Level 2: 16 words
- Level 3: 64 words
Cell Means
|                               | Factor B: Level 1 | Factor B: Level 2 | Factor B: Level 3 | Factor A: Normalized Averages |
| Factor A: Level 1             | 8, -30.3          | 32, -6.3          | 128, 89.7         | 56, 17.7                      |
| Factor A: Level 2             | 5, -33.3          | 17, -21.3         | 68, 29.7          | 30, -8.3                      |
| Factor A: Level 3             | 5, -33.3          | 17, -21.3         | 65, 26.7          | 29, -9.3                      |
| Factor B: Normalized Averages | 6, -32.3          | 22, -16.3         | 87, 48.7          |                               |
Predicted Cell Means without interaction
|                   | Factor B: Level 1 | Factor B: Level 2 | Factor B: Level 3 |
| Factor A: Level 1 | -14.7             | 1.3               | 66.3              |
| Factor A: Level 2 | -40.7             | -24.7             | 40.3              |
| Factor A: Level 3 | -41.7             | -25.7             | 39.3              |
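The predicted cell means without interaction can be reproduced from the cell means table above. A minimal sketch in Python, assuming the prediction for each cell is the normalized average of its factor A level plus the normalized average of its factor B level; the variable names are illustrative only.

```python
# Raw cell means: rows are levels of factor A, columns are levels of factor B.
cells = [
    [8, 32, 128],
    [5, 17, 68],
    [5, 17, 65],
]
m_a, m_b = len(cells), len(cells[0])

# Overall mean of the nine cells.
grand = sum(sum(row) for row in cells) / (m_a * m_b)

# Normalized averages: level means with the overall mean removed.
a = [sum(row) / m_b - grand for row in cells]
b = [sum(cells[i][j] for i in range(m_a)) / m_a - grand for j in range(m_b)]

# Main effects only: predicted (normalized) cell mean is a_i + b_j.
predicted = [[a[i] + b[j] for j in range(m_b)] for i in range(m_a)]

# What the main effects fail to explain is left over for the interaction.
leftover = [[cells[i][j] - grand - predicted[i][j] for j in range(m_b)]
            for i in range(m_a)]

for row in predicted:
    print([round(x, 1) for x in row])
```

The `leftover` values are what the ab_ij terms would have to fit if interactions were added to the model.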
ANOVA Tables
| Source          | Sum of Squares | Degrees of Freedom | Mean Square | F Ratio |
| Factor A        | 11042          | 2                  | 5521        | 30.3    |
| Factor B        | 1406           | 2                  | 703         | 0.4     |
| Remaining Error | 1276           | 7                  | 182.3       |         |
| Total Error     | 13724          |                    |             |         |
Experimental Design
Design for analysis
Simulation
Models
Based on a model, which is an abstraction of the system.
- queueing models
- requests arrive
- wait
- get service
- depart
- one queue models
- multiple queue models
- require a scheduling mechanism (scheduling protocol)
- queueing networks
Types of models
- stochastic models
- define state
- continuous time models
- discrete state models
Model development
- You did this before
- Understand the system
- What is the goal?
- Determine the components
- e.g., server, job
- sometimes called "entities"
- have attributes, e.g.
- server: capacity
- job: service required
- system (client?): interarrival time
- Cheat and steal
- Select the type of queueing model
- single server queue,
- single service facility with multiple servers,
- network of queues
- Specify attributes that need algorithms
- e.g., scheduling disciplines for resources
- Specify workload parameters and performance metrics
- remember (guess what?) the goal
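As an illustration of the steps above, here is a minimal sketch of a single-server FCFS queue in Python. The `Job` class, the `simulate_fcfs` function, and the three example jobs are hypothetical; the attributes follow the list above (server capacity, service required per job, arrival times that imply the interarrival times), and the metrics reported are waiting time and utilization.

```python
from dataclasses import dataclass


@dataclass
class Job:
    arrival: float  # time at which the request arrives
    service: float  # amount of service the request requires


def simulate_fcfs(jobs, capacity=1.0):
    """Serve jobs in arrival order on one server.

    Returns (waiting time of each job, server utilization), where
    utilization is measured from time 0 to the last departure.
    """
    free_at = 0.0  # time at which the server next becomes free
    busy = 0.0     # total time the server spends serving
    waits = []
    for job in sorted(jobs, key=lambda j: j.arrival):
        start = max(free_at, job.arrival)      # wait if the server is busy
        service_time = job.service / capacity  # faster server, shorter service
        waits.append(start - job.arrival)
        free_at = start + service_time         # the job departs
        busy += service_time
    utilization = busy / free_at if free_at > 0 else 0.0
    return waits, utilization


# Hypothetical workload: the second job arrives while the first is in service.
jobs = [Job(0.0, 2.0), Job(1.0, 2.0), Job(6.0, 1.0)]
waits, utilization = simulate_fcfs(jobs)
print(waits)
print(utilization)
```

Swapping the scheduling discipline means changing only the order in which waiting jobs are picked, which is why the scheduling algorithm is listed as a separate attribute of the model.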
Example. Routing for automated telephone support
- single server, two classes of request
- queueing models have a scheduling algorithm, such as
- FCFS
- priority-based FCFS
- round robin, aka take turns
- each has a natural structure
- FCFS: single queue
- priority-based: multiple queues
- round-robin: multiple queues
- with multiple classes
- arrival rate adds: r = r1 + r2
- interarrival time adds reciprocals: t = 1/r = 1/(r1 + r2) =
1/(1/t1 + 1/t2) = t1*t2 / (t1 + t2),
or 1/t = 1/t1 + 1/t2
- Parameters
- interarrival time: per class
- length of transaction: per class
- service capacity: per class
- queueing algorithm
- Performance metrics
- waiting time: per class
This is used to determine the cycle time of the advertising you
listen to. Just joking!
- utilization
No point in hiring any more support staff than you need.
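The rule for combining two request classes, r = r1 + r2 and 1/t = 1/t1 + 1/t2, can be checked numerically. A small sketch in Python; the function name and the example interarrival times are assumptions chosen for illustration.

```python
def combined_interarrival(t1, t2):
    """Mean interarrival time of the merged stream of two request classes."""
    r1, r2 = 1.0 / t1, 1.0 / t2  # per-class arrival rates
    return 1.0 / (r1 + r2)       # rates add, so the times add as reciprocals


# Hypothetical classes: one request every 3 time units, one every 6.
t1, t2 = 3.0, 6.0
t = combined_interarrival(t1, t2)
print(t)
print(t1 * t2 / (t1 + t2))  # closed form from the notes, for comparison
```

The merged stream always has a shorter mean interarrival time than either class alone, since adding a class can only add arrivals.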
For a more CS-like example see this pdf.