CS452 - Real-Time Programming - Spring 2011
Lecture 26 - Pathologies
Public Service Announcement
- Project proposals.
- Train availability.
- Second train control demo. `I must have been insane.'
Pathologies
As we go down this list, both the difficulty of detecting the pathology
and the length of an edit-compile-test cycle grow without bound.
1. Deadlock
One or more tasks will never run again. For example
- Task sends to itself (local: rest of system keeps running, task itself
will never run)
- Every task does Receive( ) (global: nothing is running)
- Cycle of tasks sending around the cycle (local: other tasks keep
running)
Kernel can detect such things
Potential deadlock can be detected at compile time
- cycle in the send graph of all sends that could happen
- doesn't necessarily occur at run-time
- that is, it's a necessary but not sufficient condition
  - A change that flips the outcome of a critical race can make a
    potential deadlock reveal itself.
Solutions
- Gating
- Most common example is initialization, where the send/receive
pattern may be different than FOREVER
- Gate the end of initialization
- Define three types of task
  - Administrator (A): only receives
- Worker (W): only sends
- Client (C): only sends
- Two A tasks cannot communicate directly; two W/C tasks cannot
communicate directly.
- For W/C tasks
Send appears in two flavours
- C tasks
FOREVER {
Send( A, request, result )
...
}
- W tasks
FOREVER {
Send( A, result, request )
...
}
  - The corresponding Receives are also different.
- Data types of request and result must be compatible!
- A courier converts one type to another
FOREVER {
Send( A1, request, result )
Send( A2, result, request )
}
2. Livelock (Deadly Embrace)
Definition
Two or more tasks are READY. For each task, the state of the other tasks
prevents progress being made regardless of which task is ACTIVE.
Progress would be possible with a higher level of coordination.
Two types of livelock exist
- Ones that are the result of bad coding
- Ones that are inherent in the application definition: for these you
  must detect the livelock and work around it.
Looking for solutions we prefer ones that avoid the central planner
Usually occurs in the context of resource contention
Livelock that's Really Deadlock
- client1 needs resource1 & resource2;
- obtains resource1 from proprietor1;
- asks proprietor2 for resource2
- client2 needs resource1 & resource2;
- obtains resource2 from proprietor2;
- asks proprietor1 for resource1
- possible code
- Client 1
Send( prop1, getres1, ... );
Send( prop2, getres2, ... );
// Use the resources and release them
- Client 2
Send( prop2, getres2, ... );
Send( prop1, getres1, ... );
// Use the resources and release them
- Proprietor
    FOREVER {
        Receive( &clientTid, req, ... );
        switch ( req->type ) {
        case REQUEST:
            if ( available ) {
                Reply( clientTid, use-it, ... );
                available = false;
            }
            else enqueue( clientTid );
            break;
        case RELEASE:
            available = true;
            Reply( clientTid, "thanks", ... );
            if ( !empty( Q ) ) {
                available = false;
                Reply( dequeue( ), use-it, ... );
            }
            break;
        }
    }
- state:
- client1, client2: REPLY-BLOCKED - can't release resources
- proprietor1, proprietor2: SEND-BLOCKED - waiting for release
- this is a true deadlock, but there are no cycles in the call graph.
The dependencies lie elsewhere. (You can find on the internet
arguments about terminology just as intense as anything you will ever
see in vi vs emacs or Apple vs Microsoft.)
Solutions
- Make a single compound resource, BUT
- all clients may not need both
- some resources simply cannot be compounded
- Impose a global order on resource requests that all clients must
follow.
- unsafe against malicious or incompetent programmers
- some resources don't admit strong enough ordering
- Create a mega-server that handles all resource requests
- clients request all at once
- client might not know that A is needed until processing with B is
well-advanced
Real Livelock
Proprietor1 & proprietor2 fail the requests
- Proprietor
    FOREVER {
        Receive( &clientTid, req, ... );
        switch ( req->type ) {
        case REQUEST:
            if ( available ) {
                Reply( clientTid, use-it, ... );
                available = false;
            }
            else Reply( clientTid, "sorry", ... );
            break;
        case RELEASE:
            available = true;
            Reply( clientTid, "thanks", ... );
            break;
        }
    }
- Polling is the most likely result. Typical client code.
while ( Send( prop1, getr1, ... ) != GotIt ) ;
while ( Send( prop2, getr2, ... ) != GotIt ) ;
// Use the resources
Livelock that's Really a Critical Race
We could try to make the clients a little more considerate
  FOREVER {
      Send( prop1, getres1, result );
      if ( result == "sorry" ) {
          Delay( ... );
          continue;                    // try resource 1 again
      }
      Send( prop2, getres2, result );
      if ( result == "sorry" ) {
          Send( prop1, relres1, ... ); // give resource 1 back
          Delay( ... );                // then start over
      } else {
          break;                       // have both resources
      }
  }
  If both clients choose the same delay they stay in lockstep,
  colliding forever; whether the loop terminates depends on timing,
  which is exactly a critical race.
Inherent Livelock
Remember the example where two trains come face to face, each waiting for
the other to move. They will wait facing one another until the demo is over,
probably polling.
What's hard about solving this problem?
- Neither engineer knows what the other engineer is trying to do.
In real life,
- the engineers would communicate, but
- in your software that's neither easy nor desirable
What's most easy for you to do is to programme each engineer with
- detection, e.g.,
- Delay some time
- Request again
- If turned down, work around
- work around, e.g.,
- Recommence working on goal as though track is blocked.
3. Critical Races
Example
- Two tasks, A & B, at the same priority
- A is doing a lot of debugging IO
- B always reserves a section of track before A, and all is fine.
- Debugging IO is removed
- A reserves the section before B can get it, and execution
collapses.
  - You lower the priority of A.
  - Now some other pair of tasks, C & D, has its race flipped: C
    executes faster and gets a resource before D.
  - You shuffle priorities forever, eventually reverting to the
    original ones so you can put the debugging IO back in.
Definition
The order in which computation is done is an important factor in
determining whether or not it is successful.
Critical races, like livelock, can be
- internal to the program, like the one above, or
- external to the program but inherent in the application
Symptoms
- Small changes in priorities change execution unpredictably, and
drastically.
- Debugging output changes execution drastically.
- Changes in train speeds change execution drastically.
- Example from two terms ago
`Drastically' means chaos in both senses of the term
- Sense one: a small change in the initial conditions produces an
exponentially growing change in the system
- Sense two: exercise for the reader.
Solutions
- Explicit synchronization
- but you then have to know the orders in which things are permitted
to occur
- Gating is a technique of global synchronization
- which can be provided by a detective/coordinator
4. Performance
The hardest problem to solve
- You just don't know what is possible
- Ask a question like:
- Is my kernel code at the limit of what is possible in terms of
performance?
  - We can compare the performance of message passing, etc., because
    any two kernels are pretty much the same.
- Compare a lot of kernels and you should be able to find a lower
limit
- Can't do the same thing for train applications
Priority
The hardest thing to get right
- Sizing stacks used to be harder, but now we have lots of memory
- NP-hard for the human brain
- Practical method starts with all priorities the same, then adjusts
- symptoms of good priority assignment
  - The higher the priority, the more likely the ready queue is to
    be empty
  - The shorter a task's run time in practice, the higher its
    priority should be
Problems with priority
- Priority inversion
- One resource, many clients
- Tasks try to do too much
Congestion
- Too many tasks
- blocked tasks don't count,
- lowest priority tasks almost don't count
Layered abstractions are costly
  e.g. Notifier -> SerialServer -> InputAccumulator -> Parser ->
TrackServer
Hardware
- Turn on optimization, but be careful
- There are places where you have done register allocation by
hand
- Turn on caches
Size & align calibration tables by size & alignment of cache
lines
- linker command script
- I think that this is stretching it.