CS789, Spring 2005
Lecture notes, Week 3
Control
One way to regard the interface is as a control problem. The computer
receives input from (is controlled by) its input devices, which are
controlled by the muscles of the user. It provides output to its output
devices. The user receives his or her input from (is controlled by) the
(five/six/seven) senses, which are stimulated by the computer's output
devices. He or she provides output using muscles, which may or may not be
supplemented by actuators. The control problem is dual, a feature that has
been little noticed.
Here are several things to think about.
- The user is a feedback loop that controls the computer; and the
computer is a feedback loop that controls the user. The concepts of
control theory apply to these models.
- To understand the act of control we need to know:
- the dynamics of the user,
- the dynamics of the computer,
- the transmission characteristics (response function) of the
interface.
- In general terms:
- the dynamics of the user is given (psychology, user models),
- the dynamics of the computer may be given ("Put an interface on
this application."), or may be adjustable ("Program the application
with an interface."),
- the transmission characteristics are where psychology and
programming interact.
- Two important concepts:
- Controllability, "Is there a set of inputs I can give through the
interface that allow me to have the program do what I want?".
- "Is the end state I want accessible from here, at all?"
- E.g. Gravity causes a selection near a special point to be
pulled to that point. It has the side effect of making it
impossible to select very close to the special point. (See the
sketch after this list.)
- Observability, "Is there observable information that allows me to
figure out what inputs I need to give in order to get the program
into a desired state?" Such information includes:
- "What is the current state?",
- "What inputs are possible in the current state?".
- "What is the effect of the possible inputs?".
- This problem is intractable in general. But two extreme cases are
tractable; they demonstrate some of the most important concepts.
- Real-time control, with all cognitive factors withheld, e.g.
following a moving target with a tracker,
- Discrete control, with all perceptual/motor factors ignored,
e.g., navigating the web.
Real problems actually combine both aspects at unanalysable levels of
detail.
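The gravity example above can be made concrete with a few lines of code.
The sketch below (hypothetical names and a made-up snap radius, written in
Python purely for illustration) shows the controllability failure: every
position closer to the special point than the snap radius, other than the
point itself, is unreachable because no input maps onto it.

    # Sketch of the "gravity" side effect: snapping a selection to a special
    # point makes positions inside the snap radius impossible to select.
    SNAP_RADIUS = 10.0  # pixels; an assumed value

    def apply_gravity(click, special_point, radius=SNAP_RADIUS):
        """Return the position the interface actually selects for a click."""
        dx = click[0] - special_point[0]
        dy = click[1] - special_point[1]
        if (dx * dx + dy * dy) ** 0.5 < radius:
            return special_point      # pulled to the special point
        return click                  # otherwise the click is taken literally

    special = (100.0, 100.0)
    print(apply_gravity((104.0, 103.0), special))  # (100.0, 100.0), not (104, 103)
    print(apply_gravity((130.0, 100.0), special))  # (130.0, 100.0)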
Analogue Control
The simplest example of analogue control involves following a target with
a tracker. One target acquisition paradigm -- acquiring an icon with a
tracking device -- is the most studied single interface feature in the entire
HCI literature: Fitts' law. Using psychological studies of Fitts' law HCI
researchers have tried, for example, to link mouse usage to the
characteristics of human motor mechanisms. (To study Fitts' law behaviour a
target and tracker are simultaneously provided to the user, who then 'acquires'
the target with the tracker as quickly as possible. Experiments tend to
yield expressions of the form
time_to_acquire = A log( D/S + B ) + other contributions
where S is the target size and D is the initial distance from tracker to
target. More about this in two weeks.)
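Purely as an illustration of the expression above, the sketch below (in
Python) evaluates it for a few target layouts. A and B are empirically
fitted constants; the values used here are placeholders, not measurements.

    import math

    A = 0.1   # seconds per unit of the log term (assumed placeholder)
    B = 1.0   # dimensionless offset (assumed placeholder)

    def time_to_acquire(distance, size, a=A, b=B):
        """Predicted acquisition time from the Fitts'-law form above."""
        return a * math.log(distance / size + b)

    # Doubling the distance, or halving the size, adds the same increment.
    print(time_to_acquire(distance=200, size=20))
    print(time_to_acquire(distance=400, size=20))
    print(time_to_acquire(distance=200, size=10))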
Another target acquisition paradigm is the most expensively studied aspect
of the human factors literature, the ability of a weapons aimer to maintain
his or her sight on a moving target, something of considerable interest -- to
understate the case -- to designers of military systems.
We are going to begin by doing a little mathematics, just for illustrative
purposes, to show how complicated even very simple problems are. After that
there will be a short review of some qualitative features that are important
as analogue control is generalized.
What are the dynamics as a user moves a tracker to acquire a target?
- x(t) is the position of the tracker; xT is the position of the
target.
- This is a case of feedback control.
- The current positions of tracker and target are observable.
- Any continuous change in position can be provided to the tracker.
We will progressively add complexity. (A small numerical sketch of these
models appears after this list.)
- Stationary target, instantaneous response: dx(t)/dt = -a (x(t) - xT).
- The idea:
- The user observes the relative positions of target and
tracker.
- The relative positions are used to move whatever controls the
tracker position,
- in a direction that decreases the mismatch, and
- with a speed proportional to the error.
- 'a' is a property of the human nervous and motor systems. Users
are able to adjust it very effectively to accommodate for
different task demands. (Remember speed-accuracy trade-offs.)
- Solution: x(t) - xT = (x(0) - xT ) * exp( -at ).
- Convergence as with Newton's method of finding roots; pathologies
are similar. Increasing a speeds up the convergence.
- Time to converge to a particular criterion (final distance from the
target) is a logarithmic function of the initial distance, and of the
criterion (Fitts' law).
- Target size: s; target distance: d = x(0) - xT
- x(t) - xT < s requires exp( -at ) < s/d,
- i.e., t > (1/a) log( d/s ), so the acquisition time is (1/a) log( d/s ).
- Stationary target, lagged response: dx(t)/dt = -a (x(t - t0) - xT).
- t0 is the time it takes to see updated information, a property of
the computer (response time to an input), and of the human visual
system (response time to an input).
- Solution: x(t) - xT ~ exp( -zt )
- z a complex number; z = v + iu, u, v real numbers, i the square
root of -1.
- v determines the damping (convergence); u determines the rate of
oscillation. Damped oscillation is the thing you normally experience
when you try to track using a system with poor response.
- u^2 + v^2 = a^2 exp( 2 v t0 ); u = v tan( u t0 )
- There are two solutions with no oscillation (u = 0): v = a exp( v
t0 ). (Actually there are two more solutions but we are not
interested in solutions that grow exponentially!) Both solutions
converge more rapidly than the one with instantaneous response. As t0
goes to zero it becomes the same as the instantaneous solution. As t0
increases the rate of convergence becomes faster. Note that you can
converge without oscillating only if you are able to move at exactly
the right speed!
- The non-oscillatory solution occurs when a * t0 < 0.37 (that is,
1/e), a magic number. Higher values of a, giving faster convergence,
are possible, but they oscillate, and the ratio of oscillation to
convergence increases until a * t0 = 1.57 (that is, pi/2), another
magic number. Beyond that point convergence is no longer possible.
Obviously humans can adjust a, but within what limits?
- Moving target, instantaneous response: dx(t)/dt = -a (x(t) - xT(t)).
- 'a' is a property of the human nervous and motor systems.
- Solution: x(t) = x(0) exp( -at ) + a exp( -at ) integral_0^t xT(t')
exp( at' ) dt'.
- Who cares? Nobody, but two points are important.
- Users usually do this effortlessly.
- This becomes much simpler if we assume users estimate the
velocity of the target. Then the solution is exactly the same as
with a stationary target. (Why?) Can users accomplish velocity
estimation? This is simple predictive control, about which there
is much written in engineering.
- Obviously, the real world, even of video games, is much more complex.
Fortunately, we have already learned our lesson.
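The stationary-target models above are simple enough to simulate directly.
The following Python sketch (all parameter values assumed for illustration)
integrates dx(t)/dt = -a (x(t - t0) - xT) with forward Euler steps. With
t0 = 0 it reproduces the pure exponential convergence; once a * t0 passes
roughly 0.37 the track oscillates while converging, and past roughly 1.57
the oscillation grows and convergence is lost.

    # Forward-Euler integration of dx/dt = -a * (x(t - t0) - xT).
    # Parameter values are illustrative, not measured.

    def track(a, t0, xT=0.0, x0=100.0, dt=0.001, duration=5.0):
        """Tracker positions sampled every dt seconds, starting at x0."""
        lag_steps = int(round(t0 / dt))
        xs = [x0]
        for step in range(int(duration / dt)):
            # The user reacts to the position seen t0 seconds ago.
            seen = xs[step - lag_steps] if step >= lag_steps else x0
            xs.append(xs[-1] - a * (seen - xT) * dt)
        return xs

    for a, t0 in [(5.0, 0.00),   # instantaneous: pure exponential decay
                  (5.0, 0.05),   # a*t0 = 0.25 < 0.37: no oscillation
                  (5.0, 0.15),   # a*t0 = 0.75: converges, but oscillates
                  (5.0, 0.35)]:  # a*t0 = 1.75 > 1.57: oscillation grows
        xs = track(a, t0)
        worst = max(abs(x) for x in xs[-1000:])   # xT = 0 in these runs
        print(f"a*t0 = {a * t0:4.2f}  max |x - xT| over last second = {worst:.3g}")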
Mathematics of the type written above should immediately make us ask a few
questions. For example,
- t0, the time lag, is a combination of human and computer lag times.
Computer lag times very much shorter than human lag times will be
unimportant. What are human lag times? There is, of course, a range, from
low values close to 50 milliseconds, to high values close to half a
second. Computer lags much shorter than 50 msec ought to be unimportant.
Where do these numbers come from?
- visual integration times: 50 to 200 milliseconds,
- cross-modal judgments of simultaneity: 100 milliseconds
- shortest response times: 300 milliseconds
- apparent motion: 30 to 200 milliseconds
- Oscillatory solutions depend on the ability of the human motor system
to move, which is inhibited by inertia and by damping. How fast can
humans oscillate? Close to 50 Hz. We can assume that oscillatory
solutions much faster than about 5 Hz have to be modified by inertial and
damping effects not included in such simple equations. Does this mean
that high frequency oscillatory solutions don't matter? Not at all. When
they occur users just give up!
- Lots of neat information of this type can be found in the Handbook
of Perception and Human Performance.
This is a very simple case. When we let things get more complicated
qualitatively interesting things start to appear. The following points, which
might go under the heading of "qualitative dynamics", are things we think
about when thinking about the real-time properties of our interfaces. (If you
are interested in doing more than just reading this list please see your
nearest applied math textbook on dynamical systems, and work out the
analogies to user interface behaviour.)
- Dynamical systems are usually defined by the following:
- dynamical variables, that describe how the system behaves;
- control parameters, that are used to modify system behaviour (These
may be dynamical variables of another dynamical system!);
- equations, usually differential equations, that describe how the
dynamical variables are affected by the control parameters.
- "Degrees of freedom" is an extremely useful concept in analysing
dynamical systems:
- dynamical degrees of freedom, independent ways in which a system
can respond;
- control degrees of freedom, independent ways in which a system can
be controlled;
- a simple result -- the number of independently controllable
dynamical degrees of freedom is less than or equal to the number of
independent control degrees of freedom (see the sketch after this
list);
- connection of controllability and observability to degrees of
freedom. (What is it?)
- When control parameters are changed the system normally settles down
toward a new state. These states are associated with attractors of the
dynamical system. Attractors come in three flavours:
- points
- limit cycles, closed orbits around which the system cycles
periodically
- strange attractors, fractal sets on which the motion never exactly
repeats.
Attractors partition the dynamical space into basins of attraction.
Points that separate basins of attraction are places in the dynamical
space where infinitesimal changes in the underlying system cause finite
changes in the result. These are the places that are both valuable and
dangerous. E.g. the set of pixels that define the border of a
pop-up menu. This cannot be adequately stressed; these are the places
where quantitative changes in control parameters produce qualitative
changes in behaviour of the application. The existence or expectation of
hysteresis, in the user or in the system, should always be considered near
such points.
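As one small illustration of the degrees-of-freedom statement above: a
mouse supplies two control degrees of freedom at each instant, so however
an interface maps them onto a cursor with three dynamical degrees of
freedom, at most two of those can be driven independently at any moment.
The Python sketch below (all mappings invented for illustration) simply
computes the rank of the instantaneous mapping from control motions to
cursor motions.

    import numpy as np

    def independently_controllable(mapping):
        """Dimension of the cursor motions available at one instant."""
        return np.linalg.matrix_rank(mapping)

    # Rows: cursor (x, y, z); columns: mouse (dx, dy).
    plain_mode    = np.array([[1.0, 0.0],
                              [0.0, 1.0],
                              [0.0, 0.0]])   # mouse drives x and y; z untouched
    modifier_mode = np.array([[1.0, 0.0],
                              [0.0, 0.0],
                              [0.0, 1.0]])   # mouse drives x and z; y untouched

    print(independently_controllable(plain_mode))      # 2, never more than 2
    print(independently_controllable(modifier_mode))   # 2, never more than 2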
Discrete Control
Many interfaces depend only on the order in which operations are
performed, and not on when they are performed. Furthermore they have
particularly simple input/output structures. Consider, for example, an
interface that is used for filling and submitting a form, then receiving the
results. Draw the state diagram for such an interface:
- List of states of program:
- Not accepting input.
- Accepting input for Field 1.
- Accepting input for Field 2.
- Processing form.
- Form accepted.
- Form rejected.
- List of states of interface:
- No field selected.
- Field 1 selected.
- Field 2 selected.
- Form submitted.
- Form returned: correct.
- Form returned: error.
- Actions that initiate transitions out of states:
- New form
- Choose field1
- Choose field2
- Submit form
- Remote response to submission
- Modal responses to returned forms
- Where do the transitions go?
- Exactly what action initiates a transition? And who does it, user or
application?
- Is the interface controllable?
- Can the user get to a desired state from the state he or she is in?
- Is the interface observable?
- Is there a visible control indicating affordances of a state?
- (One for each transition)
- What about long-range navigation?
- Are there pathologies?
- actions that are ignored,
- cycles you can't leave,
- dead ends,
- Have you put in something for everything that is "abnormal"?
- abandoned transactions,
- time outs,
- returning things (information, physical objects like bank
cards),
- and so on.
Doing this provides a pretty good explanation of what is going on. It is,
however, extremely tedious.
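For concreteness, here is a minimal executable sketch (in Python) of the
form interface as a transition table. The state and action names follow the
lists above; the transition targets are one plausible reading, since the
notes deliberately leave "where do the transitions go?" as a question, and
the remote accept/reject transitions belong to the application rather than
the user.

    TRANSITIONS = {
        "No field selected":      {"choose field 1": "Field 1 selected",
                                   "choose field 2": "Field 2 selected"},
        "Field 1 selected":       {"choose field 2": "Field 2 selected",
                                   "submit form":    "Form submitted"},
        "Field 2 selected":       {"choose field 1": "Field 1 selected",
                                   "submit form":    "Form submitted"},
        "Form submitted":         {"remote accepts": "Form returned: correct",
                                   "remote rejects": "Form returned: error"},
        "Form returned: correct": {"new form":       "No field selected"},
        "Form returned: error":   {"new form":       "No field selected",
                                   "choose field 1": "Field 1 selected"},
    }

    def available_actions(state):
        """Observability: what inputs are possible in the current state?"""
        return sorted(TRANSITIONS.get(state, {}))

    def reachable(start):
        """Controllability: which states can be reached from here?"""
        seen, frontier = {start}, [start]
        while frontier:
            for nxt in TRANSITIONS.get(frontier.pop(), {}).values():
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append(nxt)
        return seen

    print(available_actions("Form submitted"))
    print(sorted(reachable("No field selected")))  # all six states are reachable here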
This technique of interface design has obvious benefits, such as
- It's a tool that computer scientists already understand fairly well
from other contexts. Transfer of learning.
- It's a tool with formal properties that can be exploited. For example,
- Every state must have an exit.
- There must be a well-defined start.
- "Subroutines" can be used.
- "Data" is reasonably understood.
- and so on.
- Application/interface/user synchronization is explicit.
- The technique is roughly application-independent.
- It can profitably be used for designing parts of an interface. E.g.,
vi, Adobe Illustrator.
- It can profitably be used for explaining parts of an interface. E.g.,
PBX interfaces.
- Automated tools are at least conceivable.
There are also a variety of fairly obvious drawbacks, or at least
complications, such as,
- There is no way of handling time that is better than temporal order.
Time-outs are clearly a hack. (How long do you have to wait at a bank
machine before the machine concludes that a transaction is
incomplete?)
- Complexity kills it pretty fast. Consider the state diagram for an
editor.
- Obviously more suited to some interface devices and some user tasks
than others. For example, producing a business letter might seem similar
to filling in a form. How useful is the state diagram we produced
above?
- How much should a user be aware of the underlying state machine? Is
there value to the user independent of such knowledge, which may be
implicit?
State diagrams are an obvious place where specific, well-focussed research is
possible that will improve user interface design.
History of the State Diagram Idea
Reality