CS457 - System Performance Evaluation - Winter 2008
Questions and Comments
- A2.
- Statistics 206
Lecture 14
Analysing Data
You have some data. What next?
- Can you see any patterns? Exploratory data analysis.
- Draw graphs, look at outliers
- Are the patterns real?
- How big are the patterns?
- Does it matter? Should you do anything about it?
- Consider the parameters in their technical context
Reprise the patterns we saw in the application of Little's Law.
Terminology
- Response variable, aka dependent variable
- Factors, aka independent variables
- Levels, values a factor takes on in the experiment
- Factors and levels are chosen when an experiment is designed
- Experiment, a unique set of levels and the value determined for the
response variable
- Replications, the number of runs of an experiment
- Design, the collection of factors, levels and replications
Zero-factor Designs
Reminder from STAT206
- Measure a response variable r times. This is the sample of
measurements.
- Calculate the average.
- Can you conclude that the true mean is greater than zero?
The standard technique
- Assume that the variation is additive and normally distributed
- Calculate the average
- Calculate the standard deviation of the sample.
- Calculate the t-statistic, t = average / (standard deviation / sqrt(r)).
- Look up the number you calculated in a table of the t distribution
with r - 1 degrees of freedom.
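The steps of the standard technique can be sketched in a few lines of Python. This is only an illustration: the function name t_statistic and the five measurements are made up for the example, and the table lookup is left to the reader.

```python
import math
import statistics


def t_statistic(sample):
    """Return (average, sample standard deviation, t) for testing mean = 0."""
    r = len(sample)
    average = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation, divides by r - 1
    t = average / (s / math.sqrt(r))
    return average, s, t


# Hypothetical measurements, invented for illustration only.
sample = [1.2, 0.8, 1.5, 0.9, 1.1]
average, s, t = t_statistic(sample)
print(f"average={average:.3f} s={s:.3f} t={t:.3f} df={len(sample) - 1}")
```

The printed t is then compared with the table entry for r - 1 degrees of freedom. With 4 degrees of freedom, the usual two-sided 5% entry is 2.776.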
A different technique
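- Assume that each data point has the form yi = \mu + ei, and that the ei
are independent samples from a normal distribution with mean zero
- Calculate the average, which is your estimate for \mu
- Calculate the sum of squares of {yi}, SST = \sum_i (yi)^2, which, divided
by the error variance, has a chi-square distribution if \mu = 0
- Calculate the sum of squares of {ei = yi - \mu}, SSE = \sum_i (yi - \mu)^2,
which, divided by the error variance, has a chi-square distribution
- Calculate the difference, SS0 = SST - SSE = r(\mu)^2, which, divided by
the error variance, has a chi-square distribution if \mu = 0
- Calculate the ratio SS0/SSE, with each sum of squares first divided by
its degrees of freedom (1 and r - 1). It has an F distribution if \mu = 0
- Look up the number you calculated in a table of the F distribution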
This technique is called analysis of variance. It compares
- What the variance would be with one model, yi = ei.
- What the variance would be with a different model, yi = \mu + ei
Technical point. Degrees of freedom.
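A minimal sketch of the analysis-of-variance calculation for the zero-factor case, assuming SST means the sum of squares under the model yi = ei and SSE the sum of squares under yi = \mu + ei. The function name and the data are hypothetical. The degrees of freedom enter when the ratio is formed: 1 for SS0 and r - 1 for SSE.

```python
def zero_factor_anova(y):
    """Return (mu, sst, sse, ss0, f) for the model y_i = mu + e_i."""
    r = len(y)
    mu = sum(y) / r                       # estimate of mu
    sst = sum(v * v for v in y)           # sum of squares if y_i = e_i
    sse = sum((v - mu) ** 2 for v in y)   # sum of squares if y_i = mu + e_i
    ss0 = sst - sse                       # equals r * mu**2
    f = (ss0 / 1) / (sse / (r - 1))       # each term over its degrees of freedom
    return mu, sst, sse, ss0, f


# Hypothetical measurements, invented for illustration only.
y = [1.2, 0.8, 1.5, 0.9, 1.1]
mu, sst, sse, ss0, f = zero_factor_anova(y)
print(f"mu={mu:.3f} SST={sst:.3f} SSE={sse:.3f} SS0={ss0:.3f} F={f:.3f}")
```

For a zero-factor design this F ratio is the square of the t-statistic computed from the same sample, so the two techniques lead to the same conclusion.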
One-factor Designs
- Factor has a levels, each measured r times.
- Model is yij = \mu + \alpha_j + eij
- linear
- independent, identically normally distributed error
- Estimate \mu by \mu = (1/ar) \sum_ij yij
- Variance is SST = \sum_ij (yij - \mu)^2 = \sum_ij (yij)^2 - ar(\mu)^2,
because \sum_ij yij = ar\mu
- Estimate \alpha_j by \alpha_j = (1/r) \sum_i (yij - \mu)
- Now \sum_i e_ij = \sum_i (yij - \mu -\alpha_j) = 0.
- And \sum_ij \alpha_j = r \sum_j \alpha_j = \sum_ij (yij - \mu - e_ij)
= \sum_ij (yij - \mu) = 0
- These are given as assumptions in the textbook
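The estimates above can be checked numerically. The sketch below assumes the data is stored as one list per level, each with the same number r of replications. The function name one_factor_effects and the measurements are hypothetical.

```python
def one_factor_effects(levels):
    """levels[j][i] is replication i at level j. Returns (mu, alpha, errors, sst)."""
    a = len(levels)
    r = len(levels[0])
    if any(len(level) != r for level in levels):
        raise ValueError("every level needs the same number of replications")
    mu = sum(sum(level) for level in levels) / (a * r)
    alpha = [sum(y - mu for y in level) / r for level in levels]
    errors = [[y - mu - alpha[j] for y in level]
              for j, level in enumerate(levels)]
    sst = sum((y - mu) ** 2 for level in levels for y in level)
    return mu, alpha, errors, sst


# Hypothetical data: a = 3 levels, r = 4 replications each.
levels = [
    [10.0, 12.0, 11.0, 11.0],
    [14.0, 15.0, 13.0, 14.0],
    [9.0, 8.0, 10.0, 9.0],
]
mu, alpha, errors, sst = one_factor_effects(levels)
a, r = len(levels), len(levels[0])
sum_sq = sum(y * y for level in levels for y in level)

print("mu =", round(mu, 4))
print("alpha =", [round(x, 4) for x in alpha])
print("sum of alpha_j =", round(sum(alpha), 10))
print("per-level error sums =", [round(sum(e), 10) for e in errors])
print("SST two ways:", round(sst, 6), round(sum_sq - a * r * mu ** 2, 6))
```

Up to floating-point rounding, the \alpha_j sum to zero, the errors sum to zero within each level, and the two expressions for SST agree.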