CS452 - Real-Time Programming - Winter 2016

Lecture 27 - Reservations, Pathologies.

Public Service Announcements

  1. Train Control I demo on Friday, 11 March.
  2. The exam will start at 12.30, 19 April 2016 and finish at 15.00, 20 April 2016.


Multi-Train Control

By the next milestone you will be able to control two trains at the same time.

Sensor Attribution

Route Finding and Following

Collision Avoidance

Policy

Collision avoidance is the goal. We want a policy that controls how the track server gives out track, and how the train uses the track that it gets. The policy should have two properties.

  1. It should be easy to prove to yourself that the policy prevents collisions.
  2. The policy should be easy to implement, taking into account the real properties of the trains and the track.

Here is an example of a typical policy.

  1. The server ensures that all parts of the track have exactly one owner, where the server is a possible owner.
  2. A train may only occupy track that it owns.
  3. A train can only operate on track it owns. In practice, "operate on" means "switching turn-outs".
  4. Track should be returned to the track server as soon as a train leaves it.
  5. To avoid leapfrog deadlocks, all the track owned by a train must be contiguous.
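
A minimal sketch of how a track server might record ownership under rules 1 to 4. The segment count, the OWNER_SERVER constant and all function names are assumptions made for illustration; rule 5 (contiguity) is not checked here because it needs the track graph.

```c
#include <stdbool.h>

#define NUM_SEGMENTS 8      /* hypothetical number of track segments */
#define OWNER_SERVER (-1)   /* hypothetical marker: the server owns the segment */

/* owner[i] is the single owner of segment i: a train id (>= 0) or OWNER_SERVER. */
static int owner[NUM_SEGMENTS];

/* Rule 1: every segment always has exactly one owner; at start it is the server. */
void track_init(void) {
    for (int i = 0; i < NUM_SEGMENTS; i++) owner[i] = OWNER_SERVER;
}

/* Grant a segment only if the server currently owns it. */
bool track_reserve(int train, int seg) {
    if (train < 0 || seg < 0 || seg >= NUM_SEGMENTS) return false;
    if (owner[seg] != OWNER_SERVER) return false;
    owner[seg] = train;
    return true;
}

/* Rule 4: a segment goes back to the server; only its owner may return it. */
bool track_release(int train, int seg) {
    if (train < 0 || seg < 0 || seg >= NUM_SEGMENTS) return false;
    if (owner[seg] != train) return false;
    owner[seg] = OWNER_SERVER;
    return true;
}

/* Rules 2 and 3: may this train occupy, or switch a turn-out on, this segment? */
bool track_owns(int train, int seg) {
    return train >= 0 && seg >= 0 && seg < NUM_SEGMENTS && owner[seg] == train;
}
```

Because the table is touched only inside one server task, each reserve or release is atomic with respect to the trains.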


Reservations

Somebody has been doing something right for over a century. The answer is reservations.

Two Level Train Control

The two levels are completely independent of one another. The upper level determines which track is given to trains; the lower level is a set of rules that a train driver must obey when driving.

Upper Level

Here's roughly how things are done on track subject to centralized traffic control.

  1. Train asks dispatcher for a route.
  2. Dispatcher provides a route that he/she expects to be conflict free.
  3. Train follows the route, reporting back to the dispatcher as landmarks (sensors) are passed.

Lower Level

The lower level is also communicated by the coloured lights. In cases of conflict between the upper and lower levels, the lower level overrides the upper level.

Something Essential that You Must Do

Design your reservation system before coding it.

Before coding your reservation system, work it out on paper and make sure that it works for all the generic cases you can think of:

  1. One train following another
  2. Two trains on a collision course
  3. Short routes with much switching
  4. Single point failures.

There are one or more switches in the path

Implementing the policies.

No over-driving

Here is the sequence of events that occurs when a train stops.

  1. The train has enough track to continue driving.
  2. The train decides that it needs more track.
  3. The train requests track, and is turned down.
  4. The train gives a "speed zero" command.
  5. The train slows, stopping one stopping distance from where it gave the "speed zero" command.

If the train is to avoid overdriving its track, when does step 2 occur?
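
One possible answer, as a sketch rather than a rule: step 2 has to happen while the owned track ahead of the train is still longer than its stopping distance plus the distance it covers while the request is in flight. The units (mm, ticks), the parameter names and the function name below are assumptions for illustration.

```c
#include <stdbool.h>

/* Returns true once the train can no longer afford to wait before asking
 * for more track. Distances are in mm, velocity in mm per tick, and
 * latency_ticks is an assumed upper bound on the time from making the
 * request to the "speed zero" command taking effect. */
bool must_request_track(int owned_ahead_mm, int stopping_mm,
                        int velocity_mm_per_tick, int latency_ticks) {
    /* Distance needed to stop safely if the request is turned down. */
    int margin_mm = stopping_mm + velocity_mm_per_tick * latency_ticks;
    return owned_ahead_mm <= margin_mm;
}
```

With a stopping distance of 600 mm, a velocity of 5 mm per tick and a latency bound of 10 ticks, the margin is 650 mm, so the request must be made before the owned track ahead drops to 650 mm.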

Operating reserved track

How much must be controlled to ensure that this constraint is respected? Try listing all the things that might go wrong. Are you willing to trust the train driver?

Single owner

Something atomic, presumably a server, has to control who has what track.

No leapfrog deadlock

The server can control this. Or the server can be less smart and assume that input from the train is reliable. Then the train driver must be programmed to ask for pieces of track in the right order.

Returning reservations

Based on past history, train drivers very commonly make errors when giving back reserved track, and once they have the track it's hard to get it back from them.

In the past students have experimented with timed reservations, where the reservation is reused, like it or not, when its time has expired. Results have not been good. How can a server be sure that a train driver can exit a reservation before a pre-specified time? How can a train driver figure out that it won't leave in time, and if so what can it do?


Pathologies

As we go down this list both the difficulty of pathology detection and the length of the edit-compile-test cycle grow without bound.

1. Deadlock

One or more tasks will never run again. For example

  1. Task sends to itself (local: rest of system keeps running, task itself will never run)
  2. Every task does Receive( ) (global: nothing is running, except possibly the idle task; all tasks are SEND_BL)
  3. Cycle of tasks sending around the cycle. All tasks in the cycle are RCV_BL. (local: other tasks keep running)
  4. One train is on a siding trying to get out; another train is on the switch at the end of the siding trying to get in. Here "deadlock" means neither of the trains will ever move again. (external: application okay but the world is in an unanticipated configuration. "Unanticipated" means that no code was implemented to deal with this case.)
  5. Three trains are trying to travel around one of the small loops at a velocity such that each of them requires more than a third of the loop in front of it.

The kernel can detect the first three; only the train application can detect the last two.
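
A hedged sketch of how a kernel might detect cases 1 and 3: record, for every task blocked in Send, which task it is sending to, and walk that chain before blocking a new sender. The table size, the NOT_SENDING marker and the function names are hypothetical and not part of the course kernel specification; case 2 is not covered.

```c
#include <stdbool.h>

#define MAX_TASKS 16       /* hypothetical task table size */
#define NOT_SENDING (-1)   /* task is not blocked in Send */

/* sending_to[t] is the tid that task t is blocked sending to, or NOT_SENDING. */
static int sending_to[MAX_TASKS];

void deadlock_init(void) {
    for (int t = 0; t < MAX_TASKS; t++) sending_to[t] = NOT_SENDING;
}

/* Would a Send from `from` to `to` close a cycle of blocked senders?
 * A task sending to itself is the cycle of length one. */
bool send_would_deadlock(int from, int to) {
    int cur = to;
    /* A chain without a cycle has fewer than MAX_TASKS links. */
    for (int steps = 0; steps < MAX_TASKS; steps++) {
        if (cur < 0 || cur >= MAX_TASKS) return false;
        if (cur == from) return true;
        cur = sending_to[cur];
        if (cur == NOT_SENDING) return false;
    }
    return false;
}
```

The kernel would call send_would_deadlock() inside Send and, if it returns false, set sending_to[from] = to until the sender is unblocked by a Reply.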

Potential deadlock can be detected at compile time

Solutions

2. Livelock (Deadly Embrace)

Definition

Two or more tasks are READY. For each task, the state of other tasks prevents progress being made regardless of which task is ACTIVE.

A higher level of coordination is required.

There are two types of livelock

  1. Ones that are the result of bad coding
  2. Ones that are inherent in the application definition

Livelock usually occurs in the context of resource contention, and in the context of a train application, the resource is most often track ownership.

Livelock that's Really Deadlock

Solutions

  1. Make a single compound resource, BUT
  2. Impose a global order on resource requests that all clients must follow.
  3. Create a mega-server that handles all resource requests
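
Solution 2 can be sketched as follows, assuming every resource has a unique integer id and that ascending id is the agreed global order. Both the ids and the function names are assumptions for illustration.

```c
#include <stdlib.h>

/* Comparator for qsort: ascending order of resource id. */
static int cmp_resource_id(const void *a, const void *b) {
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Rearrange the ids a client wants into the global order. The client then
 * requests them one at a time in this order, never asking for a lower id
 * while it holds a higher one, so no cycle of waiting clients can form. */
void order_requests(int *ids, size_t n) {
    qsort(ids, n, sizeof ids[0], cmp_resource_id);
}
```

A client that wants resources 7, 2 and 5 would call order_requests() and then send its requests for 2, 5 and 7 in that order.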

Real Livelock

Proprietor1 & proprietor2 fail the requests

Livelock that's Really a Critical Race

We could try to make the clients a little more considerate

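    while ( no resources ) {
      // keep asking for the first resource until it is granted
      Send( prop1, getres1, result );
      while ( result == "sorry" ) {
        Delay( ... );
        Send( prop1, getres1, result );
      }
      Send( prop2, getres2, result );
      if ( result == "sorry" ) {
        // no second resource: give the first one back, wait, then start over
        Send( prop1, relres1, ... );
        Delay( ... );
      }
    }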
  
or even more considerate

    while ( true ) {
      Send( prop1, getres1, result );
      while ( result == "sorry" ) {
        Delay( ... );
        Send( prop1, getres1, result );
      }
      Send( prop2, getres2, result );
      if ( result == "sorry" ) {
        // no second resource: give the first one back
        Send( prop1, relres1, ... );
      } else {
        break;
      }
      Delay( ... );
    }
  
This we call a critical race because avoiding what is effectively an infinite loop depends on the timing of execution.

How quickly code like this escapes from the critical race depends on the argument you give to Delay(...).
If it's the same constant, which is common because both clients are running the same code, the livelock can persist for a long time.
If it's random, and long enough to re-order the execution order of the code, then the livelock will not persist for long.
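
A minimal sketch of a randomized argument for Delay(...), assuming no library random number generator is available. The linear congruential constants are the common textbook ones; the function names and the idea of seeding from the task or train number are assumptions.

```c
/* State of a small linear congruential generator, one per client. */
static unsigned int backoff_state;

/* Seed differently in each client, e.g. from its task id or train number. */
void backoff_seed(unsigned int seed) {
    backoff_state = seed;
}

/* Returns a pseudo-random delay in ticks in [min_ticks, max_ticks].
 * Requires min_ticks <= max_ticks. */
int backoff_ticks(int min_ticks, int max_ticks) {
    backoff_state = backoff_state * 1103515245u + 12345u;
    unsigned int span = (unsigned int)(max_ticks - min_ticks) + 1u;
    /* Use the higher bits; the low bits of an LCG are the least random. */
    return min_ticks + (int)((backoff_state >> 16) % span);
}
```

A client would then call Delay( backoff_ticks( lo, hi ) ), where hi - lo is large compared with the time it takes to run one pass of the request loop.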

Inherent Livelock

Remember the example where two trains come face to face, each waiting for the other to move. They will wait facing each other until the demo is over, probably polling.

What's hard about solving this problem?

In real life, problems like this are normally detected and solved by the dispatcher, and if the dispatcher doesn't notice then they are solved by the train drivers asking the dispatcher what to do. In your application

What's probably easiest for you to do is to programme each driver with

  1. generalized detection, e.g.,
  2. A typical work around continues as though the track is blocked.

3. Critical Races

Example

  1. Two tasks, A & B, at the same priority
  2. A is doing a lot of debugging IO
  3. B always reserves a section of track before A, and all is fine.
  4. Debugging IO is removed
  5. A reserves the section before B can get it, and execution collapses.
  6. You lower the priority of A to the same level as C.
  7. Now C executes faster and gets a resource before D.
  8. You shuffle priorities forever, eventually reverting to putting the debugging IO back in.

Definition

A specific order of computation is required for successful execution. You have already met a critical race when you programmed around the CTS hardware bug in the last part of your kernel.

Critical races, like livelock, can be

Symptoms

  1. Small changes in priorities change execution unpredictably, and drastically.
  2. Debugging output changes execution drastically.
  3. Changes in train speeds change execution drastically.

"Drastically" usually means chaos in both senses of the term

  1. Sense one: a small change in the initial conditions produces an exponentially growing divergence in the execution.
  2. Sense two: exercise for the reader.

Solutions

  1. Explicit synchronization
  2. Gating is a technique of global synchronization

4. Performance

Changes in the performance of one task with respect to another often give rise to critical races.

The hardest problem to solve

In practice, how do you know you have performance problems? Problems I have seen: