CS452 - Real-Time Programming - Winter 2016

Lecture 28 - Pathologies.

Public Service Announcements

  1. Train Control II demo on Friday, 25 March.
  2. The exam will start at 12.30, 19 April 2016 and finish at 15.00, 20 April 2016.


Multi-Train Control

Collision Avoidance

Policy

Here is an example of a typical policy.

  1. The server ensures that all parts of the track have exactly one owner, where the server is a possible owner.
  2. A train may only occupy track that it owns.
  3. A train can only operate on track it owns. In practice, "operating on" track means switching turn-outs.
  4. Track should be returned to the track server as soon as a train leaves it.
  5. To avoid leapfrog deadlocks, all the track owned by a train must be contiguous.
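
Rules 1 to 4 can be sketched as a single ownership table held by the track server. This is only a minimal sketch: the segment count, the names (track_reserve, track_release, track_may_use) and the use of -1 to mean "owned by the server" are all assumptions made for illustration, not part of any course code.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_SEGMENTS 8      /* hypothetical number of track segments */
#define OWNER_SERVER (-1)   /* assumed marker: the server owns free track */

/* owner[i] is the single owner of segment i: a train id or OWNER_SERVER. */
static int owner[NUM_SEGMENTS];

/* At start-up the server owns every segment, so rule 1 holds trivially. */
void track_init(void) {
    for (int i = 0; i < NUM_SEGMENTS; i++) owner[i] = OWNER_SERVER;
}

/* Grant a segment only if the server currently owns it. */
bool track_reserve(int train, int seg) {
    assert(train >= 0 && seg >= 0 && seg < NUM_SEGMENTS);
    if (owner[seg] != OWNER_SERVER) return false;   /* "sorry" */
    owner[seg] = train;
    return true;
}

/* Rule 4: return a segment to the server; only its owner may do so. */
bool track_release(int train, int seg) {
    assert(train >= 0 && seg >= 0 && seg < NUM_SEGMENTS);
    if (owner[seg] != train) return false;
    owner[seg] = OWNER_SERVER;
    return true;
}

/* Rules 2 and 3: a train may occupy or switch only what it owns. */
bool track_may_use(int train, int seg) {
    assert(seg >= 0 && seg < NUM_SEGMENTS);
    return owner[seg] == train;
}
```

Because every segment has exactly one entry in the table, "exactly one owner" cannot be violated by construction. Rule 5 is not checked here.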

Something Essential that You Must Do

Test your policies before implementing them.

Before implementing your policies work them out on paper and make sure that they work for all the generic cases that will be important in your project. For example,

  1. One train following another
  2. Two trains on a collision course
  3. Short routes with much switching
  4. Single point failures
  5. One or more switches in the path.

Implementing the policies.

No over-driving

Here is the sequence of events that occurs when a train stops.

  1. The train driver has enough track to continue driving.
  2. The train driver decides that it needs more track.
  3. The train driver requests track, and is turned down.
  4. The train driver gives a "speed zero" command.
  5. The train slows, stopping one stopping distance from where it was when the sp 0 command was given.

If the train is to avoid overdriving its track, when must step 2 occur?
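
One way to reason about the question: between step 2 and the moment the sp 0 command takes effect the train keeps moving, so the driver must ask while the owned track ahead still covers the stopping distance plus that extra travel. The sketch below assumes constant velocity during the request and uses hypothetical names and units (mm, mm/s, ms); your own calibration data decides the real numbers.

```c
#include <assert.h>
#include <stdbool.h>

/* Returns true when the driver can wait no longer before requesting track.
 * owned_ahead_mm:    owned track remaining in front of the train
 * stopping_mm:       stopping distance at the current speed
 * velocity_mm_per_s: current velocity
 * latency_ms:        assumed time from deciding (step 2) to sp 0 taking effect */
bool must_request_now(long owned_ahead_mm, long stopping_mm,
                      long velocity_mm_per_s, long latency_ms) {
    assert(owned_ahead_mm >= 0 && stopping_mm >= 0);
    assert(velocity_mm_per_s >= 0 && latency_ms >= 0);
    /* Distance covered while the request is made and turned down. */
    long latency_mm = velocity_mm_per_s * latency_ms / 1000;
    return owned_ahead_mm <= stopping_mm + latency_mm;
}
```

For example, at 500 mm/s with a 600 mm stopping distance and 100 ms of latency, the request has to go out while at least 650 mm of owned track remains.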

Operating reserved track

How much must be controlled to ensure that this constraint is respected? Try listing all the things that might go wrong. Are you willing to trust the train driver?

Single owner

A server controls who has what track. It must put returned track into something like a free list.

No leapfrog deadlock

The server can control this. Or the server can be less smart and assume that input from the train is reliable. Then the train driver must be programmed to ask for pieces of track in the right order.
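
If the server does the checking, one possible form is to refuse any request that is not adjacent to track the train already owns. The sketch below assumes a toy layout, a single loop in which segment i touches its two neighbours; a real track needs a graph, and all names here are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define LOOP_SEGMENTS 6     /* hypothetical loop of six segments */
#define LOOP_FREE (-1)      /* assumed marker: owned by the server */

static int loop_owner[LOOP_SEGMENTS];

void loop_init(void) {
    for (int i = 0; i < LOOP_SEGMENTS; i++) loop_owner[i] = LOOP_FREE;
}

/* On the toy loop, segments are adjacent when they differ by one (mod N). */
static bool adjacent(int a, int b) {
    int d = (a - b + LOOP_SEGMENTS) % LOOP_SEGMENTS;
    return d == 1 || d == LOOP_SEGMENTS - 1;
}

/* Grant seg only if it is free and, when the train already owns track,
 * it touches a segment the train owns. */
bool reserve_contiguous(int train, int seg) {
    assert(train >= 0 && seg >= 0 && seg < LOOP_SEGMENTS);
    if (loop_owner[seg] != LOOP_FREE) return false;
    bool owns_any = false, touches = false;
    for (int i = 0; i < LOOP_SEGMENTS; i++) {
        if (loop_owner[i] != train) continue;
        owns_any = true;
        if (adjacent(i, seg)) touches = true;
    }
    if (owns_any && !touches) return false;   /* would leave a gap */
    loop_owner[seg] = train;
    return true;
}
```

This only guards the request side. Releasing a segment from the middle of a train's holdings would also break contiguity, so a complete server would have to check returns as well.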

Returning reservations

Based on past experience, train drivers very commonly make errors when giving back reserved track. But once they have the track, it's hard to get it back from them.

In the past students have experimented with timed reservations, where the reservation is reused, like it or not, when its time has expired. Results have not been good. How can a server be sure that a train driver can exit a reservation before a pre-specified time? How can a train driver figure out that it won't leave in time, and if so what can it do?
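
To make the second question concrete, here is the kind of estimate a driver might attempt. It is a sketch under a strong assumption, constant velocity until the train leaves the reservation, which is exactly what a real train does not give you; the function name and units are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Estimate whether the train clears its reservation before it expires.
 * Assumes constant velocity, so it is optimistic whenever the train
 * has to slow down or stop inside the reservation. */
bool will_exit_in_time(long dist_to_exit_mm, long velocity_mm_per_s,
                       long now_ms, long expiry_ms) {
    assert(dist_to_exit_mm >= 0 && velocity_mm_per_s >= 0);
    if (velocity_mm_per_s == 0) return false;   /* stopped: never exits */
    /* Travel time in ms, rounded up. */
    long travel_ms = (dist_to_exit_mm * 1000 + velocity_mm_per_s - 1)
                     / velocity_mm_per_s;
    return now_ms + travel_ms <= expiry_ms;
}
```

Even when the estimate says "no", the driver has few options: it cannot leave any faster, which is part of why timed reservations have worked badly.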


Pathologies

As we go down this list, both the difficulty of detecting the pathology and the length of the edit-compile-test cycle grow without bound.

1. Deadlock

2. Livelock (Deadly Embrace)

Definition

Two or more tasks are READY. For each task, the state of other tasks prevents progress being made regardless of which task is ACTIVE.

A higher level of coordination is required.

There are two types of livelock

  1. Ones that are the result of bad coding
  2. Ones that are inherent in the application definition

Livelock usually occurs in the context of resource contention, and in the context of a train application, the resource is most often track ownership.

Livelock that's Really Deadlock

Solutions

None of the normal solutions work with something like track.

Real Livelock

Proprietor1 & proprietor2 fail the requests

Livelock that's Really a Critical Race

We could try to make the clients a little more considerate

    while ( no resources ) {
      // Wait politely for the first resource.
      Send( prop1, getres1, result );
      while ( result == "sorry" ) {
         Delay( ... );
         Send( prop1, getres1, result );
      }
      // Try for the second; on failure give the first back and start over.
      Send( prop2, getres2, result );
      if ( result == "sorry" ) {
        Send( prop1, relres1, ... );
        Delay( ... );
      }
    }
  
or even more considerate

    while ( true ) {
      Send( prop1, getres1, result );
      while ( result == "sorry" ) {
         Delay( ... );
         Send( prop1, getres1, result );
      }
      Send( prop2, getres2, result );
      if ( result == "sorry" ) {
        // Give back the first resource before waiting.
        Send( prop1, relres1, ... );
      } else {
        break;    // holding both resources
      }
      Delay( ... );
    }
  
We call this a potential critical race because which client gets the resources and uses them first is non-deterministic.
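
The effect can be seen in a small tick-by-tick simulation. This is a sketch, not kernel code: it assumes two considerate clients that ask for the two resources in opposite orders, that each Send, release or Delay tick takes one step, and that a client gives both resources back as soon as it has used them. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

enum state { WANT_FIRST, WANT_SECOND, RELEASING, DELAYING, DONE };

struct client {
    enum state st;
    int first, second;   /* resource indices, in request order */
    int backoff;         /* ticks to wait after a failed attempt */
    int wait;            /* ticks left in the current delay */
};

static bool held[2];     /* held[r] is true while resource r is owned */

/* Advance one client by one step of the "considerate" loop. */
static void step(struct client *c) {
    switch (c->st) {
    case WANT_FIRST:
        if (!held[c->first]) { held[c->first] = true; c->st = WANT_SECOND; }
        else { c->wait = c->backoff; c->st = DELAYING; }
        break;
    case WANT_SECOND:
        if (!held[c->second]) {
            /* Got both: use them, then give both back. */
            held[c->second] = true;
            held[c->first] = held[c->second] = false;
            c->st = DONE;
        } else {
            c->st = RELEASING;      /* "sorry": give the first one back */
        }
        break;
    case RELEASING:
        held[c->first] = false;
        c->wait = c->backoff;
        c->st = DELAYING;
        break;
    case DELAYING:
        if (--c->wait <= 0) c->st = WANT_FIRST;
        break;
    case DONE:
        break;
    }
}

/* Returns how many of the two clients finished within max_ticks. */
int simulate(int backoff_a, int backoff_b, int max_ticks) {
    assert(backoff_a >= 1 && backoff_b >= 1);
    struct client a = { WANT_FIRST, 0, 1, backoff_a, 0 };
    struct client b = { WANT_FIRST, 1, 0, backoff_b, 0 };
    held[0] = held[1] = false;
    for (int t = 0; t < max_ticks; t++) {
        step(&a);
        step(&b);
        if (a.st == DONE && b.st == DONE) break;
    }
    return (a.st == DONE) + (b.st == DONE);
}
```

With equal delays, as in simulate(2, 2, 1000), the clients stay in lockstep and neither finishes: each takes its first resource, is refused the second, releases, and waits, over and over. With unequal delays, as in simulate(1, 3, 1000), both finish, and which one goes first depends only on the delays.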

Livelock in the Real World

Remember the example where two trains come face to face, each waiting for the other to move. They will wait facing each other until the demo is over, probably polling.

What's hard about solving this problem?

What's probably easiest for you to do is to programme each driver with

  1. generalized detection, e.g.,
  2. A typical work around continues as though the track is blocked.

3. Critical Races

Example

  1. Two tasks, A & B, at the same priority
  2. A is doing a lot of debugging IO
  3. B always reserves a section of track before A, and all is fine.
  4. Debugging IO is removed
  5. A reserves the section before B can get it, and execution collapses.
  6. You lower the priority of A to the same level as C.
  7. Now C executes faster and gets a resource before D.
  8. You shuffle priorities forever, eventually reverting to putting the debugging IO back in.

Definition

A specific order of computation is required for successful execution. You have already met a critical race when you programmed around the CTS hardware bug in the last part of your kernel.

Critical races, like Livelock can be

Symptoms

  1. Small changes in priorities change execution unpredictably, and drastically.
  2. Debugging output changes execution drastically.
  3. Changes in train speeds change execution drastically.

`Drastically' usually means chaos in both senses of the term

  1. Sense one: a small change in the initial conditions produces an exponentially growing divergence in the execution.
  2. Sense two: exercise for the reader.

Solutions

  1. Explicit synchronization
  2. Gating is a technique of global synchronization

But the solution you should like best is simply writing code that works regardless of the order of execution. Doing so is hard, but worth the trouble.
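
As an illustration of gating, here is a sketch of the bookkeeping a gate might keep: tasks arrive in any order, and none is let through until all have arrived. In a Send/Receive/Reply kernel this state would presumably live in a server that withholds its replies until the gate opens; the structure and names below are assumptions, not course code.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_TASKS 8          /* hypothetical upper bound */

struct gate {
    int expected;            /* number of tasks that must arrive */
    int arrived;             /* number that have arrived so far */
    int waiting[MAX_TASKS];  /* ids of tasks blocked at the gate */
};

void gate_init(struct gate *g, int expected) {
    assert(expected > 0 && expected <= MAX_TASKS);
    g->expected = expected;
    g->arrived = 0;
}

/* Record an arrival. Returns the number of tasks to unblock:
 * 0 while the gate is closed, `expected` when the last task arrives.
 * The ids to unblock are then in g->waiting[0 .. expected-1]. */
int gate_arrive(struct gate *g, int task_id) {
    assert(g->arrived < g->expected);
    g->waiting[g->arrived++] = task_id;
    if (g->arrived < g->expected) return 0;
    g->arrived = 0;          /* reset so the gate can be reused */
    return g->expected;
}
```

The order in which tasks reach the gate no longer matters, only that all of them do. The cost is that every task now runs at the pace of the slowest.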

4. Performance

Changes in the performance of one task with respect to another often give rise to critical races.

The hardest problem to solve

In practice, how do you know you have performance problems? Problems I have seen: