CS452 - Real-Time Programming - Spring 2011
Lecture 28 - Pathologies
Public Service Announcement
- Train availability.
- Second train control demo. July 14.
Pathologies
As we go down this list, both the difficulty of detecting the pathology and
the length of an edit-compile-test cycle grow without bound.
1. Deadlock (Deadly Embrace)
2. Livelock
Definition
Two or more tasks are READY. For each task, the state of the other tasks
prevents progress being made regardless of which task is ACTIVE.
Progress would be possible given a higher level of coordination.
Two types of livelock exist
- ones that are the result of bad coding, which you fix, and
- ones that are inherent in the application definition, which you can
only detect and work around.
Livelock usually occurs in the context of resource contention. Looking for
solutions, we prefer ones that avoid a central planner.
Livelock that's Really Deadlock
- client1 needs resource1 & resource2;
- obtains resource1 from proprietor1;
- asks proprietor2 for resource2
- client2 needs resource1 & resource2;
- obtains resource2 from proprietor2;
- asks proprietor1 for resource1
- possible code
- Client 1
Send( prop1, getres1, ... );
Send( prop2, getres2, ... );
// Use the resources and release them
- Client 2
Send( prop2, getres2, ... );
Send( prop1, getres1, ... );
// Use the resources and release them
- Proprietor
FOREVER {
    Receive( &clientTid, &req, ... );
    switch ( req.type ) {
    case REQUEST:
        enqueue( clientQ, clientTid, req.resource );
        if ( available( db, req.resource ) ) {
            client = dequeue( clientQ );
            Reply( client.tid, extract( db, client.resource ), ... );
        }
        break;
    case RELEASE:
        insert( db, req.resource );
        Reply( clientTid, "thanks", ... );
        foreach ( client in clientQ ) {
            if ( available( db, client.resource ) ) {
                remove( clientQ, client );
                Reply( client.tid, extract( db, client.resource ), ... );
            }
        }
        break;
    }
}
Solutions
- Make a single compound resource, BUT
- all clients may not need both
- some resources simply cannot be compounded
- Impose a global order on resource requests that all clients must
follow.
- unsafe against malicious or incompetent programmers
- some resources don't admit strong enough ordering
- Create a mega-server that handles all resource requests
- clients request all at once
- client might not know that A is needed until processing with B is
well-advanced
Real Livelock
Proprietor1 & proprietor2 refuse requests they cannot satisfy immediately,
instead of queueing them.
- Proprietor
case REQUEST:
    if ( available( freelist, req.resource ) )
        Reply( clientTid, extract( freelist, req.resource ), ... );
    else
        Reply( clientTid, "sorry", ... );
    break;
case RELEASE:
    insert( freelist, req.resource );
    Reply( clientTid, "thanks", ... );
    break;
- Polling is the most likely result. Typical client code:
FOREVER {
    Send( prop1, getres1, &result );
    if ( result == GotIt ) break;
}
FOREVER {
    Send( prop2, getres2, &result );
    if ( result == GotIt ) break;
}
// Use the resources
Livelock that's Really a Critical Race
We could try to make the clients a little more considerate:
while ( resources not all held ) {
    Send( prop1, getres1, &result );
    while ( result == "sorry" ) {
        Delay( ... );
        Send( prop1, getres1, &result );
    }
    Send( prop2, getres2, &result );
    if ( result == "sorry" ) Send( prop1, relres1, ... );
}
Whether this loop ever terminates depends on the precise interleaving of
the two clients' delays, which is a critical race.
Inherent Livelock
Remember the example where two trains come face to face, each waiting for
the other to move. They will wait facing one another until the demo is over,
probably polling.
What's hard about solving this problem?
- Neither engineer knows what the other engineer is trying to do.
In real life,
- the engineers would communicate, but
- in your software that's neither easy nor desirable
The easiest thing for you to do is to programme each engineer with
- detection, e.g.,
- Delay some time
- Request again
- If turned down, work around
- work around, e.g.,
- Recommence working on the goal as though the track were blocked.
3. Critical Races
Example
- Two tasks, A & B, at the same priority
- A is doing a lot of debugging IO
- B always reserves a section of track before A, and all is fine.
- The debugging IO is removed
- A now reserves the section before B can get it, and execution
collapses.
- You lower the priority of A to the same level as C's.
- Now C executes faster and gets a resource before D, and execution
collapses somewhere else.
- You shuffle priorities forever, eventually reverting, in order to put
back the debugging IO.
Definition
The order in which computation is done is an important factor in
determining whether or not it is successful.
Critical races, like livelock, can be
- internal to the program, like the one above, or
- external to the program but inherent in the application
Symptoms
- Small changes in priorities change execution unpredictably, and
drastically.
- Debugging output changes execution drastically.
- Changes in train speeds change execution drastically.
- Example from two terms ago
'Drastically' means chaos in both senses of the term
- Sense one: a small change in the initial conditions produces an
exponentially growing change in the system
- Sense two: exercise for the reader.
Solutions
- Explicit synchronization
- but you then have to know the orders in which things are permitted
to occur
- Gating is a technique of global synchronization
- which can be provided by a detective/coordinator
Inherent Critical Races
Above we recommended solving livelock inherent to the train set by a
- detect
- back-off
- work around
strategy. With this strategy we don't know which train will back off, only
that almost all the time only one train will back off. Thus, in the long run,
there are three possible futures
- Train A backs off; train B follows through.
- Train B backs off; train A follows through.
- Both train A and train B back-off.
Which future occurs is the result of who wins a critical race. Yet the
existence of this critical race is not necessarily a bug. If all
three futures accomplish the goal of your project, the critical race does not
determine whether or not you achieve your goal, only the specific way in
which the goal is achieved.
Watch the trains when such a critical race occurs. In my observation this
is when they are most likely to look intelligent.
4. Time Lags
You can't send to the train controller a command like
- `When train 23 is ten centimetres past sensor A2 set its speed to
zero.'
You can only send a command like
- `Set the speed of train 23 to zero now.'
When to send this command depends on your knowledge of the size of the
delay times the velocity of the train. (There are ways you have probably
already discovered where you don't need separate measurements of these two
quantities. Think about what stopping distance actually is.)
Hunting
- Oscillation around the desired state, which occurs when corrections
are based on stale information because of the time lag.
5. Performance
The hardest problem to solve
- You just don't know what is possible
- Ask a question like:
- Is my kernel code at the limit of what is possible in terms of
performance?
- We can compare the performance of message passing, etc., because
any two kernels are pretty much the same.
- Compare a lot of kernels and you should be able to find a lower
limit
- Can't do the same thing for train applications
Priority
The hardest thing to get right
- Sizing stacks used to be harder, but now we have lots of memory
- NP-hard for the human brain
- Practical method starts with all priorities the same, then adjusts
- symptoms of good priority assignment
- The higher the priority, the more likely that level's ready
queue is to be empty
- The shorter the run time in practice, the higher the priority
Problems with priority
- Priority inversion
- One resource, many clients
- Tasks try to do too much
Congestion
- Too many tasks
- blocked tasks don't count,
- lowest priority tasks almost don't count
Layered abstractions are costly
e.g. Notifier -> SerialServer -> InputAccumulator -> Parser ->
TrackServer
Hardware
- Turn on optimization, but be careful
- There are places where you have done register allocation by
hand
- Turn on caches
Size & align calibration tables by size & alignment of cache
lines
- linker command script
- I think that this is stretching it.