CS457 - System Performance Evaluation - Winter 2010

Public Service Announcements

  1. Graduate school
  2. Added books to References

Lecture 10 - Exploratory Data Analysis


The collection of probes inserted to generate traces.


What other reasons might you have for wanting to measure performance?

What do we try to measure?

  1. Arrival and departure times of requests, to get
  2. Processor activity, such as
  3. Other resource activity, particularly NICs
  4. Failures

Levels of Measurement

  1. Application code
  2. Profiling tools
  3. Operating system
  4. Kernel
  5. Hardware

The Big Question

How much does the introduction of monitoring software influence the performance of the application?

  1. Event-driven monitors
  2. Sampling monitors
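
The difference between the two kinds of monitor can be sketched on a simulated request queue. This is only an illustration, not a real monitor: the event list, the function names event_driven_monitor and sampling_monitor, and the sampling period are all assumptions made up for the example. An event-driven monitor writes one record per event, so its cost grows with the event rate. A sampling monitor writes one record per period, so its cost is fixed, but it can miss short-lived states.

```python
# Hypothetical events: (time, change in queue length); +1 = arrival, -1 = departure.
EVENTS = [(0.5, +1), (1.2, +1), (1.4, -1), (1.6, -1), (3.1, +1), (4.7, -1)]


def event_driven_monitor(events):
    """Record the queue length every time an event changes it."""
    trace, queue_len = [], 0
    for t, delta in events:
        queue_len += delta
        trace.append((t, queue_len))
    return trace


def sampling_monitor(events, period, end_time):
    """Record the queue length only at times 0, period, 2*period, ..."""
    samples, queue_len, i, k = [], 0, 0, 0
    while k * period <= end_time:
        t = k * period
        # Apply every event that happened up to the sampling instant.
        while i < len(events) and events[i][0] <= t:
            queue_len += events[i][1]
            i += 1
        samples.append((t, queue_len))
        k += 1
    return samples


event_trace = event_driven_monitor(EVENTS)
sample_trace = sampling_monitor(EVENTS, period=1.0, end_time=5.0)

print(len(event_trace), max(q for _, q in event_trace))
print(len(sample_trace), max(q for _, q in sample_trace))
```

With these made-up events both monitors write six records, but the sampled trace never sees the queue length of 2 that existed between t = 1.2 and t = 1.4.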



What I have called a log is usually called a trace, which is

The nature of logs

Tools for looking at logs

Exploratory Data Analysis (EDA)

Data is most conveniently handled in the form of records

independent variable 1 (IV1), IV2, IV3, ..., dependent (measured) variable 1 (DV1), DV2, DV3, ...
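
A record layout of this kind might be represented as follows. This is a sketch under stated assumptions: the field names num_clients, request_size_kb and response_time_ms, the sample rows, and the helper read_records are hypothetical and do not come from any particular log.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Record:
    # Independent variables (hypothetical): chosen by the experimenter.
    num_clients: int
    request_size_kb: int
    # Dependent variable (hypothetical): measured during the run.
    response_time_ms: float


# Made-up log extract: independent variables first, then the measured one.
LOG_TEXT = """num_clients,request_size_kb,response_time_ms
1,4,12.5
1,64,30.1
8,4,19.8
8,64,95.0
"""


def read_records(text):
    """Parse comma-separated text into a list of Record objects."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        Record(
            num_clients=int(row["num_clients"]),
            request_size_kb=int(row["request_size_kb"]),
            response_time_ms=float(row["response_time_ms"]),
        )
        for row in reader
    ]


records = read_records(LOG_TEXT)
print(records[0])
```

Keeping the independent and dependent variables in separately named fields makes later selection and grouping by an independent variable straightforward.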

Typical independent variables

Typical dependent variables

Rule 1

Explore only a subset of the data if no more will become available.

Rule 2

Use your eyes

Rule 3

Split the data

  1. You see an indistinct pattern.
  2. Select the data in which the pattern exists: the pattern is now distinct
  3. Ask how the selected data differs from the unselected data
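
The three steps above can be sketched in a few lines. Everything here is assumed for illustration: the records, the 50 ms threshold used to select the slow cluster, and the helper names split and iv_counts.

```python
from collections import Counter

# Hypothetical records: (cache setting, request size in KB, response time in ms).
RECORDS = [
    ("on", 4, 11.0), ("on", 64, 14.0), ("on", 4, 12.0), ("on", 64, 13.0),
    ("off", 4, 15.0), ("off", 64, 92.0), ("off", 4, 16.0), ("off", 64, 88.0),
]

THRESHOLD_MS = 50.0  # assumed cut-off that isolates the slow cluster


def split(records, predicate):
    """Return (selected, unselected) according to predicate."""
    selected = [r for r in records if predicate(r)]
    unselected = [r for r in records if not predicate(r)]
    return selected, unselected


def iv_counts(records):
    """Count how often each (cache, size) combination occurs."""
    return Counter((cache, size) for cache, size, _ in records)


# Step 2: select the data in which the pattern exists.
slow, rest = split(RECORDS, lambda r: r[2] > THRESHOLD_MS)

# Step 3: compare the independent variables of the two parts.
print("selected:  ", iv_counts(slow))
print("unselected:", iv_counts(rest))
```

In this made-up data every selected record has the cache off and a 64 KB request, and no unselected record has that combination, which is the kind of difference step 3 asks about.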

Potentially Useful Displays of Data

  1. Bar charts & box plots
  2. Histograms & scatter plots
  3. Three-dimensional plots
  4. Summaries of partitioned data
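
As an illustration of the last item, a summary of partitioned data might be computed like this. The measurements and the helper summarize_by are hypothetical; the statistics chosen (count, minimum, median, mean, maximum) are one reasonable set, not a prescribed one.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical (number of clients, response time in ms) measurements.
MEASUREMENTS = [
    (1, 10.0), (1, 12.0), (1, 11.0),
    (8, 20.0), (8, 30.0), (8, 100.0),
]


def summarize_by(measurements):
    """Partition by the independent variable and summarize each partition."""
    groups = defaultdict(list)
    for iv, dv in measurements:
        groups[iv].append(dv)
    return {
        iv: {
            "n": len(values),
            "min": min(values),
            "median": median(values),
            "mean": mean(values),
            "max": max(values),
        }
        for iv, values in sorted(groups.items())
    }


summary = summarize_by(MEASUREMENTS)
for iv, stats in summary.items():
    print(iv, stats)
```

For the 8-client partition the median is 30.0 but the mean is 50.0, which points at a single large value worth inspecting.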

Important: Look for data that doesn't belong, then look back at it and its environment in the log.
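
Looking back at a suspicious value and its environment can be partly automated. The sketch below is an assumption-laden example: the log format, the event names, the helper names find_suspects and context, and the rule "more than 5 times the median" are all invented here, and any other way of flagging odd values would serve.

```python
from statistics import median

# Hypothetical log: one "time event response_ms" line per entry.
LOG_LINES = [
    "0.10 request 12",
    "0.20 request 11",
    "0.30 gc_start 0",
    "0.31 request 250",
    "0.40 gc_end 0",
    "0.50 request 13",
    "0.60 request 12",
]


def find_suspects(lines, factor=5.0):
    """Return indices of request lines whose value exceeds factor * median."""
    values = [(i, float(line.split()[2]))
              for i, line in enumerate(lines)
              if line.split()[1] == "request"]
    typical = median(v for _, v in values)
    return [i for i, v in values if v > factor * typical]


def context(lines, index, width=1):
    """Return the lines within `width` positions of lines[index]."""
    return lines[max(0, index - width): index + width + 1]


for i in find_suspects(LOG_LINES):
    print("\n".join(context(LOG_LINES, i)))
```

Here the one flagged request sits between the made-up gc_start and gc_end entries, so its environment in the log suggests where to look for an explanation.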
