The easy part

The other easy part

The hard part.

Two things are hard.

- Dealing with measurement error
  - Two categories of error
    - Systematic = controllable
    - Random = uncontrollable
  - Two strategies for dealing with error
    - Promotion
      - Segregate data to promote random error into systematic error
    - Be conservative
      - The further you get in your project the more you will want to relax conservative assumptions
- Determining what state is relevant

Large collection of data records

Train | Speed | Section | Previous speed | Time since speed change (seconds) | Time since maintenance (hours) | Cleanliness of track | Previous speed (coded) | Section type (coded) | Velocity (cm/sec)

25 | 8 | 31 | 10 | 23 | 76 | | higher | curved | 8.9

How to manipulate the data

- Code data with lots of values
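
Coding replaces a variable that takes many values with a handful of categories. A minimal sketch, assuming the coded previous speed only records whether the train was previously going faster or slower than it is now; the function name and the `lower` and `same` labels are hypothetical, only `higher` appears in the record above.

```python
def code_previous_speed(previous_speed: int, current_speed: int) -> str:
    """Collapse the many possible previous speeds into three coded levels."""
    if previous_speed > current_speed:
        return "higher"
    if previous_speed < current_speed:
        return "lower"   # assumed label, not from the notes
    return "same"        # assumed label, not from the notes


print(code_previous_speed(10, 8))  # higher
```

With fewer distinct values per factor, each subdivision of the data keeps enough records to give a usable mean.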

What to do with the data

- Remove the mean
- mean_velocity = (1/N) sum velocity

- Calculate the remaining variance
- variance = (1/N) sum (velocity - mean_velocity)^2

- Form a linear model
- velocity = a(train) + b(speed) + c(section type) + ...

- Calculate the optimal values for each factor
- subdivide by factor value
- calculate the mean for each subdivision

- Find out which factors matter
- What fraction of the variance do they remove?
- Is the difference between velocities for different factor values worth worrying about?
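
The steps above can be sketched as follows. This is an illustration under stated assumptions, not a full analysis of variance: the records are made-up numbers, the function names are hypothetical, the variance is normalized by N, and each factor is examined on its own, which ignores interactions between factors.

```python
from collections import defaultdict

# Made-up records for illustration: (train, speed, section type, velocity in cm/sec).
records = [
    (25, 8, "curved", 8.0),
    (25, 8, "straight", 10.0),
    (25, 10, "curved", 12.0),
    (25, 10, "straight", 14.0),
]
TRAIN, SPEED, SECTION_TYPE, VELOCITY = range(4)


def mean(values):
    return sum(values) / len(values)


def variance(values):
    """Mean squared deviation from the mean (normalized by N)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def factor_means(rows, factor):
    """Subdivide by factor value and return the mean velocity of each subdivision."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[factor]].append(row[VELOCITY])
    return {value: mean(vs) for value, vs in groups.items()}


def fraction_of_variance_removed(rows, factor):
    """Share of the total variance that disappears once the factor's means are subtracted."""
    total = variance([row[VELOCITY] for row in rows])
    if total == 0:
        return 0.0
    means = factor_means(rows, factor)
    residual = mean([(row[VELOCITY] - means[row[factor]]) ** 2 for row in rows])
    return 1.0 - residual / total


for name, factor in (("train", TRAIN), ("speed", SPEED), ("section type", SECTION_TYPE)):
    print(name, factor_means(records, factor),
          round(fraction_of_variance_removed(records, factor), 2))
```

In this toy data, speed removes most of the variance, section type removes a little, and train removes none because only one train appears. Whether a small fraction is still worth modelling is the judgment call the notes describe.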

The result is a collection of factors and values for each factor

- which are the ones that are worth considering

In reality a lot of intuition about the trains goes into the above judgment.

- Calibrate the day before

AND/OR

- Calibrate at the beginning of the demo

AND/OR

- Calibrate as the demo runs

Whatever you do, you can't do ANOVA online.

Consider this:

- You already know the factors and their values
- Allocate a value for each
- Initialize the value with a pre-estimate
- Each time you measure a velocity
- find the appropriate value
- update the value using something like new_value = a * new_measurement + (1-a) * current_value
- experiment to find a good value for a.
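
The update rule above is an exponentially weighted moving average. A minimal sketch, assuming the calibration table is keyed by a (train, speed) pair; the class name, the key layout, and the default `a` are hypothetical choices for illustration.

```python
class OnlineCalibration:
    """Table of calibration values, each updated as new measurements arrive."""

    def __init__(self, pre_estimates, a=0.1):
        if not 0.0 < a <= 1.0:
            raise ValueError("a must be in (0, 1]")
        self.a = a
        # Initialize every value with its pre-estimate.
        self.values = dict(pre_estimates)

    def update(self, key, new_measurement):
        """Blend one new measurement into the stored value for this key."""
        current_value = self.values[key]
        new_value = self.a * new_measurement + (1 - self.a) * current_value
        self.values[key] = new_value
        return new_value


# Hypothetical pre-estimate: train 25 at speed 8 moves at 9.0 cm/sec.
calibration = OnlineCalibration({(25, 8): 9.0}, a=0.5)
print(calibration.update((25, 8), 10.0))  # 9.5
print(calibration.update((25, 8), 11.0))  # 10.25
```

A larger `a` follows changes in the train more quickly but lets more measurement noise into the table; a smaller `a` is smoother but slower to adapt.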

You might want to consider

- You are already doing a whole lot of measurements
- Average in a circular buffer to get variance estimate
- Turn on optimization, but be careful
- There are places where you have done register allocation by hand

- Size & align calibration tables by size & alignment of cache lines
- linker command script

- Slowing and stopping
- each train has a built-in velocity profile when stopping
- you can create your own velocity profile
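
The circular-buffer idea in the list above, keeping the most recent measurements and averaging over them to estimate variance, could look like the sketch below. The class name, the capacity, and the choice to normalize by the number of stored samples are assumptions for illustration.

```python
class MeasurementWindow:
    """Fixed-size circular buffer holding the most recent measurements."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.samples = [0.0] * capacity
        self.count = 0   # number of valid samples, at most capacity
        self.next = 0    # slot the next sample will overwrite

    def add(self, sample):
        self.samples[self.next] = sample
        self.next = (self.next + 1) % len(self.samples)
        self.count = min(self.count + 1, len(self.samples))

    def mean(self):
        if self.count == 0:
            raise ValueError("no samples yet")
        # Until the buffer wraps, only the first `count` slots hold data.
        return sum(self.samples[:self.count]) / self.count

    def variance(self):
        m = self.mean()
        return sum((s - m) ** 2 for s in self.samples[:self.count]) / self.count


window = MeasurementWindow(3)
for velocity in (1.0, 2.0, 3.0, 4.0):   # the fourth sample overwrites the first
    window.add(velocity)
print(window.mean())  # 3.0
```

Because the oldest sample is overwritten, the estimate reflects only recent behaviour, which suits a train whose performance drifts during a demo.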
