CS457 - System Performance Evaluation - Winter 2010


Public Service Announcements

  1. Mid-term

Lecture 11 - Data Analysis I

Exploratory Data Analysis (EDA)

Rule 1

Explore only a subset of the data if no more will become available; the rest is then left for testing your tentative conclusions.

Rule 2

Use your eyes

Rule 3

Split the data

  1. You see an indistinct pattern.
  2. Select the data in which the pattern exists: the pattern is now distinct
  3. Ask how the selected data differs from the unselected data

Potentially Useful Displays of Data

  1. Bar charts & box plots
  2. Histograms & scatter plots
  3. Three-dimensional plots
  4. Summaries of partitioned data
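
As a rough illustration of item 4, the sketch below partitions measurements by a label and prints a small summary of each partition. The request types and response times are made up for the example, and the helper name summarize_by_group is hypothetical; which partition and which summary are useful depends on your own data.

```python
from collections import defaultdict
from statistics import median


def summarize_by_group(records):
    """Group (label, value) pairs by label and summarize each group."""
    groups = defaultdict(list)
    for label, value in records:
        groups[label].append(value)
    summary = {}
    for label, values in groups.items():
        summary[label] = {
            "n": len(values),
            "min": min(values),
            "median": median(values),
            "max": max(values),
        }
    return summary


# Hypothetical (request type, response time in msec) pairs, not real data.
records = [
    ("browse", 10.0), ("browse", 12.0), ("browse", 14.0),
    ("search", 30.0), ("search", 34.0),
]
summary = summarize_by_group(records)
for label, stats in sorted(summary.items()):
    print(label, stats)
```

A partition whose summary differs sharply from the others is a candidate for the "split the data" questions under Rule 3.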

Important: Look for data that doesn't belong, then look back at it in the log, paying attention to its environment.


Motivation of Analysis of Variance (pdf)

Example - Response Time

What I am doing here is typical and schematic. When you are processing a log you will execute different commands than the exact ones I give.

Preparing the data

Do not use the commands below blindly on any log file. Every log file has a different format, so you can never skip step 1 below.

  1. Study the log to find out how the information is formatted.
  2. Put a sequence number on the data in case you want to get the order back
  3. Get rid of all but arrivals and departures
  4. Sort the data on reqid.
  5. Optionally, add the system load at this point
  6. Now join the lines
  7. Now calculate the response time
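
The steps above can be sketched in Python under a loudly stated assumption: each log line is taken to have the hypothetical form "timestamp_msec event reqid", with events named arrive and depart. No real log is guaranteed to look like this, so step 1 still comes first. Step 5 (system load) is left out of the sketch.

```python
def response_times(log_lines):
    """Return (reqid, response time in msec) pairs from a hypothetical log."""
    events = []
    for seq, line in enumerate(log_lines):      # step 2: sequence number
        fields = line.split()
        if len(fields) != 3:                    # skip lines in another format
            continue
        timestamp, event, reqid = fields
        if event in ("arrive", "depart"):       # step 3: keep only these
            events.append((reqid, seq, event, float(timestamp)))
    events.sort()                               # step 4: sort on reqid, then seq
    arrivals = {}
    results = []
    for reqid, seq, event, timestamp in events:  # step 6: join the lines
        if event == "arrive":
            arrivals[reqid] = timestamp
        elif reqid in arrivals:
            # step 7: response time = departure time - arrival time
            results.append((reqid, timestamp - arrivals.pop(reqid)))
    return results


# Made-up log lines in the assumed format.
log = [
    "100.0 arrive r1",
    "101.5 arrive r2",
    "103.0 heartbeat -",
    "112.0 depart r1",
    "115.5 depart r2",
]
times = response_times(log)
print(times)
```

Keeping the sequence number in each tuple means the original order can be recovered later by sorting on it.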

Look at the Data

For example

Test the Tentative Conclusions

For example

  1. The histogram of response times made the typical response time seem to be about 12 msec.
  2. Calculate the average
    Suppose you get 11.7853609 msec. What do you think?
  3. Remember your statistics and calculate the sample variance and standard deviation.
  4. Look back at your tentative conclusions. For example,
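
Steps 2 and 3 can be sketched with Python's standard statistics module. The sample values are invented for illustration; note that variance and stdev compute the sample versions (dividing by n - 1), which is what step 3 asks for.

```python
from statistics import mean, stdev, variance

# Hypothetical response times in msec, not measured data.
samples = [10.0, 12.0, 11.0, 13.0, 14.0]

avg = mean(samples)        # step 2: the average
var = variance(samples)    # step 3: sample variance (n - 1 in the denominator)
sd = stdev(samples)        # step 3: sample standard deviation

print(f"mean = {avg:.1f} msec, s^2 = {var:.2f} msec^2, s = {sd:.2f} msec")
```

Comparing the standard deviation with the average gives one way to judge how many digits of the average are worth reporting.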

When your data has passed the tests

Measurement is finished. How do you use the results?

Example. You saw a difference of 20 msec between the response times for reqtype1 (browsing) and reqtype2 (searching).

Example. You saw that each increment of 1 in the load adds 500 msec to the response time for browsing.
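
The notes do not say how such a per-unit-load figure would be obtained. One common possibility, shown here only as an assumption, is an ordinary least-squares line fitted to (load, response time) pairs. The data are invented so that the slope comes out at 500 msec per unit of load, and the function name least_squares_line is hypothetical.

```python
def least_squares_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up (load, browsing response time in msec) measurements.
loads = [1, 2, 3, 4]
resp = [600.0, 1100.0, 1600.0, 2100.0]

slope, intercept = least_squares_line(loads, resp)
print(f"slope = {slope:.0f} msec per unit load, intercept = {intercept:.0f} msec")
```

With real measurements the points will not lie exactly on a line, so the fitted slope should be read together with the scatter plot it came from.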
