CS452 - Real-Time Programming - Spring 2017

Lecture - Magellan Spacecraft

Error Handling

We have mentioned several times in this course that when you detect an error you usually have two choices:
  1. programmatically adjust internal parameters and see if the error persists, or
  2. abandon execution, find the error and correct it using the usual editor/compiler method.
Sometimes the second is more complex than what you experience in the lab. An extreme example occurs when a spacecraft is millions of kilometers from earth and a bug appears in its stabilization system. (A spacecraft is subject to random forces, such as solar wind, that gradually destabilize its orientation and rotational velocity. Therefore it has a computer-controlled system that supplies negative feedback to changes in orientation to stabilize the spacecraft.)
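
To make the idea of negative feedback concrete, here is a minimal single-axis sketch. It is not Magellan's control law: the unit moment of inertia, the gains kp and kd, the time step, and the size of the random disturbance are all assumptions chosen for illustration. It only shows that a torque opposing the measured error and its rate keeps the orientation bounded, while the uncontrolled case drifts.

```python
import random

def simulate(steps=2000, dt=0.1, kp=0.5, kd=1.0, controlled=True, seed=1):
    """Toy single-axis attitude model; returns the largest |angle| seen.

    Assumes a unit moment of inertia, so torque equals angular acceleration.
    All numeric values are illustrative, not taken from any real spacecraft.
    """
    rng = random.Random(seed)
    angle, rate = 0.0, 0.0   # orientation error (rad) and angular rate (rad/s)
    worst = 0.0
    for _ in range(steps):
        # Small random torque standing in for solar wind and similar forces.
        disturbance = rng.uniform(-0.01, 0.01)
        # Negative feedback: the control torque opposes the error and its rate.
        control = -(kp * angle + kd * rate) if controlled else 0.0
        rate += (disturbance + control) * dt
        angle += rate * dt
        worst = max(worst, abs(angle))
    return worst

if __name__ == "__main__":
    print("with feedback:   ", simulate(controlled=True))
    print("without feedback:", simulate(controlled=False))
```

Running it with the same seed for both cases compares identical disturbance sequences, so any difference in the result comes from the feedback term alone.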

The spacecraft

The journey

  1. May 1989: left earth by space shuttle via elliptic orbit around sun.
  2. October 1989: passed close to Venus orbit, Venus not there.
  3. March 1990: arrived again at earth orbit, Earth not there.
  4. August 1990: arrived at Venus orbit, Venus there. Fired thrusters to go into orbit around Venus, an operation in which the spacecraft slows down.

Consider a spacecraft, millions of kilometers away. You can't press the reset button. Here's how the Magellan spacecraft handled such a problem. Start with the symptoms.

  1. The spacecraft arrived at Venus in an unstable orbit with respect to Venus (not with respect to the sun), and had to fire its thrusters to change the orbit to a stable one. The firing had to occur when the spacecraft was behind Venus, out of sight from earth.
  2. The spacecraft passed the terminator (the edge of the planet) and disappeared.
  3. About an hour later it was supposed to reappear at the other side of Venus and send a signal to earth. Nothing.
  4. The earth antenna started broadcasting a guide signal toward Venus.
  5. Twenty-one hours later, the spacecraft said hello, by sending a prompt, '>'.
  6. The earth antenna sent the four bytes CRUZ followed by an operating system kernel, which was really a cyclic executive plus a program that reads sensors and initiates appropriately scaled actions in the spacecraft's attitude control system.
  7. Five days later they again tried to change the orbit.
  8. This time it was only thirteen hours before the spacecraft answered.
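
The "cyclic executive" mentioned in step 6 is a loop that runs a fixed table of tasks in a fixed order, with no scheduler. The following is a minimal sketch of that structure only. The task names, the two-frame schedule, and the function run are hypothetical, and a real cyclic executive would wait for a timer tick at the start of each frame, which is omitted here.

```python
def read_sensors():
    # Placeholder task: a real one would sample the attitude sensors.
    return "read_sensors"

def attitude_control():
    # Placeholder task: a real one would command appropriately scaled actions.
    return "attitude_control"

def telemetry():
    # Placeholder task: a real one would queue data for the downlink.
    return "telemetry"

# Hypothetical major cycle made of two minor frames. Sensors and control
# run in every frame; telemetry runs in every second frame.
SCHEDULE = [
    [read_sensors, attitude_control],
    [read_sensors, attitude_control, telemetry],
]

def run(schedule, frames):
    """Execute the task table frame by frame and return a trace of what ran."""
    trace = []
    for frame in range(frames):
        # The frame number alone decides which tasks run, and in what order.
        for task in schedule[frame % len(schedule)]:
            trace.append((frame, task()))
    return trace

if __name__ == "__main__":
    for frame, name in run(SCHEDULE, 4):
        print(frame, name)
```

Because the order of execution is fixed in the table, timing behaviour can be worked out by inspection, which is one reason this structure is attractive when the program must be small and predictable.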

What had happened?

  1. Solar panels normally aligned perpendicular to sun.
  2. Periodically a small telescope is pointed at a guide star to correct the attitude of the spacecraft.
  3. Reflection inside the barrel of the telescope thought to be guide star.
  4. Unforeseen value causes some memory to be rewritten.
  5. Computer crashes.
  6. Watchdog timer times out.
  7. Computer reboots.
  8. Part of rebooting is finding out where you are and what direction you are pointed.
  9. Helical scan allows solar panels to find sun: one axis established.
  10. Small antenna scans looking for guide signal from earth; second axis defined, communication re-established.
  11. New core image downloaded.
  12. Programmers on earth can start debugging.
Fortunately, the erroneous orbit of the rebooting spacecraft was a stable one.
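
The crash, watchdog timeout, and reboot steps in the list above can be sketched as a small simulation. This is an illustration of the general watchdog pattern, not the spacecraft's actual code: the class Watchdog, the tick-based timing, the timeout of three ticks, and the wording of the recovery steps are all assumptions made for the example.

```python
class Watchdog:
    """Countdown timer that healthy software must reset ("kick") regularly."""

    def __init__(self, timeout_ticks):
        self.timeout_ticks = timeout_ticks
        self.remaining = timeout_ticks

    def kick(self):
        # Healthy software calls this periodically to prove it is still alive.
        self.remaining = self.timeout_ticks

    def tick(self):
        # Advance one clock tick; return True once the countdown has expired.
        self.remaining -= 1
        return self.remaining <= 0

# Recovery sequence paraphrased from the list above.
RECOVERY_STEPS = [
    "helical scan: solar panels find sun",
    "antenna scan: find guide signal from earth",
    "communication re-established",
    "new core image downloaded",
]

def run(ticks, crash_at, timeout_ticks=3):
    """Simulate a crash at tick crash_at and return the resulting event log."""
    dog = Watchdog(timeout_ticks)
    crashed = False
    events = []
    for t in range(ticks):
        if t == crash_at:
            crashed = True          # crashed software stops kicking the watchdog
            events.append((t, "crash"))
        if not crashed:
            dog.kick()
        if dog.tick():
            events.append((t, "watchdog reset"))
            events.extend((t, step) for step in RECOVERY_STEPS)
            crashed = False         # the reboot restores a running system
            dog.kick()
    return events

if __name__ == "__main__":
    for event in run(ticks=10, crash_at=2):
        print(event)
```

The key property is that the watchdog needs no cooperation from the crashed program: the absence of kicks is itself the signal that triggers the reboot.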
