# CS457 - System Performance Evaluation - Winter 2008

1. A2.
2. Statistics 206

# Analysing Data

You have some data. What next?

1. Can you see any patterns? Exploratory data analysis.
• Draw graphs, look at outliers
2. Are the patterns real?
• Test hypotheses
3. How big are the patterns?
• Estimate parameters
4. Does it matter? Should you do anything about it?
• Consider the parameters in their technical context

Reprise the patterns we saw in the application of Little's Law.

#### Terminology

• Response variable, aka dependent variable: this is what you measure.
• Factors, aka independent variables.
• Levels: the values a factor takes on in the experiment.
• Factors and levels are chosen when an experiment is designed.
• Experiment: a unique set of levels and the value determined for the response variable.
• Replications: the number of runs of an experiment.
• Design: the collection of factors, levels, and replications.
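
The terminology can be made concrete with a small sketch. The factor names, levels, and replication count below are hypothetical, and the sketch assumes that every combination of levels is run (a full design), which these notes do not require.

```python
from itertools import product

# Hypothetical factors and their levels, chosen only for illustration.
factors = {
    "cache_size_kb": [256, 512],
    "scheduler": ["fifo", "round_robin", "priority"],
}
replications = 3  # number of runs of each experiment


def enumerate_design(factors, replications):
    """Return one (levels, run) pair for every run in the design."""
    names = list(factors)
    runs = []
    # Each combination of levels is one experiment.
    for combo in product(*(factors[name] for name in names)):
        levels = dict(zip(names, combo))
        for run in range(1, replications + 1):
            runs.append((levels, run))
    return runs


design = enumerate_design(factors, replications)
print(len(design))  # number of measurements of the response variable
```

Each entry in `design` is one measurement to take: the response variable is recorded once per (levels, run) pair.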

## Zero-factor Designs

Reminder from STAT206

1. Measure a response variable r times. This is the sample of measurements.
2. Calculate the average.
3. Can you conclude that the average is greater than zero?

The standard technique

1. Assume that the variation is additive and normally distributed
2. Calculate the average
3. Calculate the standard deviation of the sample.
4. Calculate the t-statistic.
5. Look up the number you calculated in a table of the t distribution.
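
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a full test procedure: the measurements are made up, the helper name `t_statistic` is hypothetical, and the table lookup in the last step is left to the reader.

```python
import math
import statistics


def t_statistic(sample, mu0=0.0):
    """One-sample t-statistic for the hypothesis that the mean equals mu0."""
    r = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (divides by r - 1)
    return (mean - mu0) / (sd / math.sqrt(r))


# Made-up measurements, for illustration only.
sample = [2.0, 4.0, 6.0, 8.0]
t = t_statistic(sample)
print(t)
```

The resulting value is compared with the t table entry for r - 1 degrees of freedom (here 3) at the chosen confidence level.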

A different technique

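1. Assume that each data point has the form $y_i = \mu + e_i$, and that each $e_i$ is a sample from a normal distribution with mean zero
2. Calculate the average $\bar{y}$, which is your estimate for $\mu$
3. Calculate the sum of squares of $\{y_i\}$, SST $= \sum_i y_i^2$, which, divided by the error variance, has a chi-square distribution
4. Calculate the sum of squares of $\{e_i = y_i - \bar{y}\}$, SSE $= \sum_i (y_i - \bar{y})^2$, which, divided by the error variance, has a chi-square distribution
5. Calculate the difference, SS0 $=$ SST $-$ SSE $= r\bar{y}^2$, which, divided by the error variance, has a chi-square distribution
6. Calculate the ratio SS0/SSE, with each sum of squares divided by its degrees of freedom, which has an F distribution when $\mu = 0$
7. Look up the number you calculated in a table of the F distribution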

This technique is called analysis of variance. It compares:

• What the variance would be with one model, $y_i = e_i$.
• What the variance would be with a different model, $y_i = \mu + e_i$.
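
The comparison of the two models can be sketched as follows. This is an illustration under stated assumptions: SST is read as the sum of squares about zero (the variation under the model with no mean; some texts give it a different name), the data are made up, and `zero_factor_anova` is a hypothetical helper name.

```python
def zero_factor_anova(y):
    """Sums of squares and F ratio for the model y_i = mu + e_i."""
    r = len(y)
    mean = sum(y) / r                      # estimate of mu
    sst = sum(v * v for v in y)            # variation under y_i = e_i
    sse = sum((v - mean) ** 2 for v in y)  # variation under y_i = mu + e_i
    ss0 = sst - sse                        # variation accounted for by mu
    # Each sum of squares is divided by its degrees of freedom: 1 and r - 1.
    f = (ss0 / 1) / (sse / (r - 1))
    return sst, sse, ss0, f


# Made-up measurements, for illustration only.
sst, sse, ss0, f = zero_factor_anova([2.0, 4.0, 6.0, 8.0])
print(sst, sse, ss0, f)
```

With one degree of freedom in the numerator, the F ratio equals the square of the t-statistic computed from the same data, so the two techniques lead to the same conclusion.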

Technical point. Degrees of freedom.
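
One way to make this point concrete, as a sketch assuming $r$ measurements and reading SST as the sum of squares about zero: each sum of squares carries a number of degrees of freedom (written under each term below), the degrees of freedom add up in the same way as the sums of squares, and the F ratio divides each sum of squares by its own degrees of freedom.

$$
\underbrace{\sum_i y_i^2}_{\text{SST},\ r}
= \underbrace{r\bar{y}^2}_{\text{SS0},\ 1}
+ \underbrace{\sum_i (y_i - \bar{y})^2}_{\text{SSE},\ r-1},
\qquad
F = \frac{\text{SS0}/1}{\text{SSE}/(r-1)}
$$

The table lookup therefore uses the F distribution with $1$ and $r-1$ degrees of freedom.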

## One-factor Designs

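1. The factor has $a$ levels, each measured $r$ times.
2. Model is $y_{ij} = \mu + \alpha_j + e_{ij}$, where $i = 1, \dots, r$ indexes replications and $j = 1, \dots, a$ indexes levels
• linear
• independent, identically normally distributed error
3. Estimate $\mu$ by $\hat{\mu} = \frac{1}{ar} \sum_{ij} y_{ij}$
4. Total variation is SST $= \sum_{ij} (y_{ij} - \hat{\mu})^2 = \sum_{ij} y_{ij}^2 - ar\hat{\mu}^2$; the second equality holds because $\sum_{ij} y_{ij} = ar\hat{\mu}$
5. Estimate $\alpha_j$ by $\hat{\alpha}_j = \frac{1}{r} \sum_i (y_{ij} - \hat{\mu})$
• Now $\sum_i e_{ij} = \sum_i (y_{ij} - \hat{\mu} - \hat{\alpha}_j) = 0$ for every level $j$.
• And $\sum_{ij} \hat{\alpha}_j = r \sum_j \hat{\alpha}_j = \sum_{ij} (y_{ij} - \hat{\mu} - e_{ij}) = \sum_{ij} (y_{ij} - \hat{\mu}) = 0$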
• These are given as assumptions in the textbook
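
The estimates and the two zero-sum properties can be checked numerically. The sketch below uses made-up data with a = 3 levels and r = 2 replications; the helper name `one_factor_effects` and the layout `y[j][i]` (level j, replication i) are assumptions made for illustration.

```python
def one_factor_effects(y):
    """Estimate mu and the alpha_j, where y[j][i] is level j, replication i.

    Assumes every level has the same number of replications r.
    """
    a = len(y)
    r = len(y[0])
    mu = sum(sum(level) for level in y) / (a * r)
    alpha = [sum(v - mu for v in level) / r for level in y]
    errors = [[v - mu - alpha[j] for v in level] for j, level in enumerate(y)]
    sst = sum((v - mu) ** 2 for level in y for v in level)
    return mu, alpha, errors, sst


# Made-up data: 3 levels, 2 replications each.
y = [[1.0, 3.0], [4.0, 6.0], [8.0, 8.0]]
mu, alpha, errors, sst = one_factor_effects(y)

print(mu, alpha)
print([sum(level) for level in errors])  # errors sum to zero within each level
print(sum(alpha))                        # effects sum to zero across levels
# Shortcut form of SST: sum of squares minus a * r * mu^2.
print(sst, sum(v * v for level in y for v in level) - len(y) * len(y[0]) * mu ** 2)
```

The zero sums follow from how the estimates are constructed, which is why they hold for any data set with equal replications per level.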