One way to regard the interface is as a control problem. The computer receives input from (is controlled by) its input devices, which are controlled by the muscles of the user. It provides output to its output devices. The user receives his or her input from (is controlled by) the (five/six/seven) senses, which are stimulated by the computer's output devices. He or she provides output using muscles, which may or may not be supplemented by actuators. The control problem is dual, a feature that has been little noticed.

Here are several things to think about.

- The user is a feedback loop that controls the computer; and the computer is a feedback loop that controls the user. The concepts of control theory apply to these models.
- To understand the act of control we need to know:
- the dynamics of the user,
- the dynamics of the computer,
- the transmission characteristics (response function) of the interface.

- In general terms:
- the dynamics of the user is given (psychology, user models),
- the dynamics of the computer may be given ("Put an interface on this application."), or may be adjustable ("Program the application with an interface."),
- the transmission characteristics are where psychology and programming interact.

- Two important concepts:
- Controllability, "Is there a set of inputs I can give through the
interface that allow me to have the program do what I want?".
- "Is the end state I want accessible from here, at all?"
- E.g. Gravity causes a selection near a special point to be pulled to that point. It has the side effect of making it impossible to select very close to the special point.

- Observability, "Is there observable information that allows me to
figure out what inputs I need to give in order to get the program
into a desired state?" Such information includes:
- "What is the current state?",
- "What inputs are possible in the current state?".
- "What is the effect of the possible inputs?".

- This problem is intractable in general. But two extreme cases are
tractable; they demonstrate some of the most important concepts.
- Real-time control, with all cognitive factors withheld, e.g. following a moving target with a tracker,
- Discrete control, with all perceptual/motor factors ignored, e.g., navigating the web.


The simplest example of analogue control involves following a target with
a tracker. One target acquisition paradigm -- acquiring an icon with a
tracking device -- is the most studied single interface feature in the entire
HCI literature: Fitts' law. Using psychological studies of Fitts' law HCI
researchers have tried, for example, to link mouse usage to the
characteristics of human motor mechanisms. (To study Fitts' law behaviour a
target and tracker are simultaneously provided to the user who then 'acquires'
the target with the tracker as quickly as possible. Experiments tend to
yield expressions of the form

time_to_acquire = A log( D/S + B ) + other contributions

where S is the target size and D is the initial distance from tracker to
target. More about this in two weeks.)
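
As a rough illustration of the expression above, here is a minimal Python sketch. The constants `A` and `B` and the `other` term are placeholders chosen for illustration only; they are not fitted values from any experiment, and the function name is an assumption of this sketch.

```python
import math

def time_to_acquire(distance, size, A=0.1, B=1.0, other=0.0):
    """Fitts'-law-style estimate: A * log(D/S + B) + other contributions.

    A, B and `other` are illustrative placeholders, not measured values.
    """
    if distance < 0 or size <= 0:
        raise ValueError("distance must be >= 0 and size must be > 0")
    return A * math.log(distance / size + B) + other

# Same target size, two starting distances.
near = time_to_acquire(distance=200, size=10)
far = time_to_acquire(distance=400, size=10)
```

Because the dependence is logarithmic, doubling the distance adds only a roughly constant increment to the predicted time once D/S is large compared with B.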

Another target acquisition paradigm is the most expensively studied aspect of the human factors literature, the ability of a weapons aimer to maintain his or her sight on a moving target, something of considerable interest -- to understate the case -- to designers of military systems.

We are going to begin by doing a little mathematics, just for illustrative purposes, to show how complicated even very simple problems are. After that there will be a short review of some qualitative features that are important as analogue control is generalized.

What are the dynamics as a user moves a tracker to acquire a target?

- x(t) is the position of the tracker; xT is the position of the target.
- This is a case of feedback control.
- The current positions of tracker and target are observable.
- Any continuous change in position can be provided to the tracker.

We will progressively add complexity.

- Stationary target, instantaneous response: dx(t)/dt = -a (x(t) - xT).
- The idea:
- The user observes the relative positions of target and tracker.
- The relative positions are used to move whatever controls the tracker position,
- in a direction that decreases the mismatch, and
- with a speed proportional to the error.
- 'a' is a property of the human nervous and motor systems. Users are able to adjust it very effectively to accommodate different task demands. (Remember speed-accuracy trade-offs.)

- Solution: x(t) - xT = (x(0) - xT ) * exp( -at ).
- Convergence as with Newton's method of finding roots; pathologies are similar. Increasing a speeds up the convergence.
- Time to converge to a particular criterion (final distance from the
target) is a logarithmic function of the initial distance, and of the
criterion (Fitts' law).
- Target size: s; target distance: d = |x(0) - xT|
- |x(t) - xT| < s implies exp(-at) < s/d
- t > (1/a) log( d/s ), so the time to acquire is (1/a) log( d/s )
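
The stationary-target model can be checked numerically. The sketch below integrates dx/dt = -a (x - xT) with a simple forward-Euler step and compares it with the exponential solution. The gain `a = 4.0`, the step size and the starting position are arbitrary values assumed for illustration.

```python
import math

def simulate_tracking(x0, x_target, a, dt=0.001, t_end=5.0):
    """Forward-Euler integration of dx/dt = -a * (x - x_target)."""
    steps = int(round(t_end / dt))
    x = x0
    trajectory = [x]
    for _ in range(steps):
        x += dt * (-a * (x - x_target))
        trajectory.append(x)
    return trajectory

def closed_form(x0, x_target, a, t):
    """The exponential solution x(t) = xT + (x(0) - xT) * exp(-a t)."""
    return x_target + (x0 - x_target) * math.exp(-a * t)

def time_to_converge(d, s, a):
    """Time at which the error has shrunk from d to s: (1/a) * log(d/s)."""
    return math.log(d / s) / a

a = 4.0  # hypothetical gain, in 1/seconds
traj = simulate_tracking(x0=100.0, x_target=0.0, a=a)
t_hit = time_to_converge(d=100.0, s=1.0, a=a)
```

Doubling the initial distance adds a fixed log(2)/a to the acquisition time, which is the logarithmic dependence the Fitts' law experiments report.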

- Stationary target, lagged response: dx(t)/dt = -a (x(t - t0) - xT).
- The idea:
- t0 is the time it takes to see updated information, a property of the computer (response time to an input), and of the human visual system (response time to an input).
- Solution: x(t) - xT ~ exp( -zt )
- z a complex number; z = v + iu, u, v real numbers, i the square root of -1.
- v determines the damping (convergence); u determines the rate of oscillation. Damped oscillation is the thing you normally experience when you try to track using a system with poor response.
- u^2 + v^2 = a^2 exp( 2 v t0 ); u = v tan( u t0 )
- There are two solutions with no oscillation (u = 0): v = a exp( v t0 ). (Actually there are two more solutions but we are not interested in solutions that grow exponentially!) Both solutions converge more rapidly than the one with instantaneous response. As t0 goes to zero the slower of the two becomes the same as the instantaneous solution. As t0 increases its rate of convergence becomes faster. Note that you can converge without oscillating only if you are able to move at exactly the right speed!
- The non-oscillatory solution occurs when a * t0 < 1/e, about 0.37, a magic number. Higher values of a, with faster convergence, are possible, but they oscillate, and the ratio of oscillation to convergence increases until a * t0 = pi/2, about 1.57, another magic number. Beyond that point convergence is no longer possible. Obviously humans can adjust a, but within what limits?
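
The three regimes of the lagged model (smooth convergence, damped oscillation, divergence) can be seen in a small simulation. This is a sketch under stated assumptions: the target sits at xT = 0, the tracker is held at x = 1 for all t <= 0, and the equation is integrated with forward Euler. The function names and the values of `a` and `t0` are illustrative only.

```python
def simulate_lagged_tracking(a, t0, dt=0.001, t_end=None):
    """Forward-Euler integration of dx/dt = -a * x(t - t0), target at 0.

    Assumes the tracker sits at x = 1 for all t <= 0 (a hypothetical start).
    """
    if t_end is None:
        t_end = 40.0 * t0
    lag_steps = max(1, int(round(t0 / dt)))
    xs = [1.0] * (lag_steps + 1)  # history covering [-t0, 0]
    for _ in range(int(round(t_end / dt))):
        delayed = xs[-1 - lag_steps]  # the position seen t0 ago
        xs.append(xs[-1] - dt * a * delayed)
    return xs[lag_steps:]  # samples for t >= 0

def describe(a, t0):
    """Classify the run by its sign changes and its late-time amplitude."""
    xs = simulate_lagged_tracking(a, t0)
    crossings = sum(1 for p, q in zip(xs, xs[1:]) if p * q < 0)
    tail_start = len(xs) * 3 // 4
    tail = max(abs(x) for x in xs[tail_start:])
    if tail > 1.0:
        return "diverges"
    return "oscillates, converges" if crossings else "converges smoothly"

t0 = 0.2  # hypothetical total lag in seconds
regimes = {a * t0: describe(a, t0) for a in (1.0, 5.0, 10.0)}
```

The three gains are chosen so that a * t0 is 0.2, 1.0 and 2.0, one value in each of the ranges separated by the thresholds 0.37 and 1.57 quoted above.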

- Moving target, instantaneous response: dx(t)/dt = -a (x(t) - xT(t)).
- 'a' is a property of the human nervous and motor systems.
- Solution: x(t) = x(0) exp( -at ) + a exp( -at ) integral_0^t xT(t') exp( at' ) dt'.
- Who cares? Nobody, but two points are important.
- Users usually do this effortlessly.
- This becomes much simpler if we assume users estimate the velocity of the target. Then the solution is exactly the same as with a stationary target. (Why?) Can users accomplish velocity estimation? This is simple predictive control, about which there is much written in engineering.

- Obviously, the real world, even of video games, is much more complex. Fortunately, we have already learned our lesson.
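
Returning to the moving-target case, a small simulation shows what velocity estimation buys. The sketch assumes a target moving at constant speed, xT(t) = target_speed * t, and a user whose velocity estimate is exact. Both are idealizations, and the names and numbers are chosen only for illustration.

```python
def track_moving_target(a, target_speed, use_velocity_estimate,
                        dt=0.001, t_end=5.0):
    """Forward-Euler integration for a target at xT(t) = target_speed * t.

    Without prediction: dx/dt = -a * (x - xT).
    With prediction:    dx/dt = target_speed - a * (x - xT),
    which assumes the user's velocity estimate is exact (an idealization).
    Returns the final error x - xT.
    """
    x = 1.0  # hypothetical initial offset; the target starts at 0
    for step in range(int(round(t_end / dt))):
        x_target = target_speed * step * dt
        feedforward = target_speed if use_velocity_estimate else 0.0
        x += dt * (feedforward - a * (x - x_target))
    return x - target_speed * t_end

lag_error = track_moving_target(a=4.0, target_speed=2.0,
                                use_velocity_estimate=False)
predicted_error = track_moving_target(a=4.0, target_speed=2.0,
                                      use_velocity_estimate=True)
```

Without prediction the tracker settles into a constant lag of target_speed / a behind the target. With the velocity term added, the error obeys the same equation as for a stationary target and decays to zero.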

Mathematics of the type written above should immediately make us ask a few questions. For example,

- t0, the time lag, is a combination of human and computer lag times.
Computer lag times very much shorter than human lag times will be
unimportant. What are human lag times? There is, of course, a range, from
low values close to 50 milliseconds, to high values close to half a
second. Computer lags much shorter than 50 msec ought to be unimportant.
Where do these numbers come from?
- visual integration times: 50 to 200 milliseconds,
- cross-modal judgments of simultaneity: 100 milliseconds
- shortest response times: 300 milliseconds
- apparent motion: 30 to 200 milliseconds

- Oscillatory solutions depend on the ability of the human motor system to move, which is inhibited by inertia and by damping. How fast can humans oscillate? Close to 5 Hz. We can assume that oscillatory solutions much faster than about 5 Hz have to be modified by inertial and damping effects not included in such simple equations. Does this mean that high frequency oscillatory solutions don't matter? Not at all. When they occur users just give up!
- Lots of neat information of this type can be found in the
*Handbook of Perception and Human Performance*.

This is a very simple case. When we let things get more complicated qualitatively interesting things start to appear. The following points, which might go under the heading of "qualitative dynamics", are things we think about when thinking about the real-time properties of our interfaces. (If you are interested in doing more than just reading this list please see your nearest applied math textbook on dynamical systems, and work out the analogies to user interface behaviour.)

- Dynamical systems are usually defined by the following:
- dynamical variables, that describe how the system behaves;
- control parameters, that are used to modify system behaviour (These may be dynamical variables of another dynamical system!);
- equations, usually differential equations, that describe how the dynamical variables are affected by the control parameters.

- "Degrees of freedom" is an extremely useful concept in analysing
dynamical systems:
- dynamical degrees of freedom, independent ways in which a system can respond;
- control degrees of freedom, independent ways in which a system can be controlled;
- a simple result -- the number of independently controllable dynamical degrees of freedom is less than or equal to the number of independent control degrees of freedom;
- connection of controllability and observability to degrees of freedom. (What is it?)

- When control parameters are changed the system normally settles down
toward a new state. These states are associated with attractors of the
dynamical system. Attractors come in three flavours:
- points
- limit cycles, speaking loosely, these are dense in the space of the dynamical variables
- strange attractors, not dense.

Many interfaces depend only on the order in which operations are performed, and not on when they are performed. Furthermore they have particularly simple input/output structures. Consider, for example, an interface that is used for filling and submitting a form, then receiving the results. Draw the state diagram for such an interface:

- List of states of program:
- Not accepting input.
- Accepting input for Field 1.
- Accepting input for Field 2.
- Processing form.
- Form accepted.
- Form rejected.

- List of states of interface:
- No field selected.
- Field 1 selected.
- Field 2 selected.
- Form submitted.
- Form returned: correct.
- Form returned: error.

- Actions that initiate transitions out of states:
- New form
- Choose field1
- Choose field2
- Submit form
- Remote response to submission
- Modal responses to returned forms

- Where do the transitions go?
- Exactly what action initiates a transition? And who does it, user or application?
- Is the interface controllable?
- Can the user get to a desired state from the state he or she is in?

- Is the interface observable?
- Is there a visible control indicating affordances of a state?
- (One for each transition)
- What about long-range navigation?

- Are there pathologies?
- actions that are ignored,
- cycles you can't leave,
- dead ends.

- Have you put in something for everything that is "abnormal"?
- abandoned transactions,
- time outs,
- returning things (information, physical objects like bank cards),
- and so on.
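
The controllability and pathology questions above can be asked mechanically once the state diagram is written down as a table. The sketch below uses the interface states and actions listed above, but the transition targets are one hypothetical answer to "Where do the transitions go?" and are not taken from these notes. One state is deliberately left without an exit so that the check has something to find.

```python
from collections import deque

# Hypothetical transition table: state -> {action: next state}.
TRANSITIONS = {
    "no field selected": {"choose field1": "field 1 selected",
                          "choose field2": "field 2 selected",
                          "submit form": "form submitted"},
    "field 1 selected": {"choose field2": "field 2 selected",
                         "submit form": "form submitted"},
    "field 2 selected": {"choose field1": "field 1 selected",
                         "submit form": "form submitted"},
    "form submitted": {"remote response: ok": "form returned: correct",
                       "remote response: error": "form returned: error"},
    "form returned: correct": {"new form": "no field selected"},
    "form returned: error": {},  # deliberately left as a dead end
}

def reachable(start, transitions):
    """Breadth-first search: every state some input sequence can reach."""
    seen, queue = {start}, deque([start])
    while queue:
        for target in transitions[queue.popleft()].values():
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen

def dead_ends(transitions):
    """States with no outgoing transition at all."""
    return sorted(s for s, out in transitions.items() if not out)

def controllable(transitions):
    """True if every state can be reached from every other state."""
    states = set(transitions)
    return all(reachable(s, transitions) == states for s in states)
```

In this table `dead_ends(TRANSITIONS)` reports the error state. Adding a modal response that leads from it back to a field makes every state reachable from every other. Observability is a separate question: it asks whether each entry in the table has a visible control, which a table like this does not record.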

This technique of interface design has obvious benefits, such as

- It's a tool that computer scientists already understand fairly well from other contexts. Transfer of learning.
- It's a tool with formal properties that can be exploited. For example,
- Every state must have an exit.
- There must be a well-defined start.
- "Subroutines" can be used.
- "Data" is reasonably understood.
- and so on.

- Application/interface/user synchronization is explicit.
- The technique is roughly application-independent.
- It can profitably be used for designing parts of an interface. E.g., vi, Adobe Illustrator.
- It can profitably be used for explaining parts of an interface. E.g., PBX interfaces.
- Automated tools are at least conceivable.
- Statecharts.

It also has obvious drawbacks, such as

- There is no way of handling time that is better than temporal order. Time-outs are clearly a hack. (How long do you have to wait at a bank machine before the machine concludes that a transaction is incomplete?)
- Complexity kills it pretty fast. Consider the state diagram for an editor.
- Obviously more suited to some interface devices and some user tasks than others. For example, producing a business letter might seem similar to filling in a form. How useful is the state diagram we produced above?
- How much should a user be aware of the underlying state machine? Is there value to the user independent of such knowledge, which may be implicit?

Separate the interface from its presentation

- Each state has a presentation.
- Each input moves the system to a new state.

An interface really is a finite number of states plus transitions between them, BUT

- the number of states is huge, e.g. 50^3000 ≈ 10^5100.
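
The size of that number is easiest to check with logarithms. A minimal sketch follows; the variable names are illustrative, and the notes do not say what the base 50 and the exponent 3000 stand for.

```python
import math

# log10 of the number of states, 50**3000, without forming the number itself.
log10_states = 3000 * math.log10(50)

# Split into a power of ten and a leading factor between 1 and 10.
power_of_ten = math.floor(log10_states)
leading_factor = 10 ** (log10_states - power_of_ten)
```

The power of ten comes out a little under 5100, so the figure quoted above is a rounded order of magnitude.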

There are two easy-to-solve limits

- number of states -> one
- number of states -> infinity

Typical number of states is closer to infinity than it is to one! At infinity we look for algebras as generalizations of state diagrams.

Assume

- All states accessible.
- All transitions invertible.

Look for the closure. What do you get?

- Enumerate the states in any order