The C language itself is beginning to complicate the creation of a self-documenting program with its use of escaped characters. Of course, this should not be taken as a flaw in the language; in any other application, the approach taken by C is suitable and even desirable. There simply is no way to discern a use of a double quote from a mention without additional information. That information comes in the form of the optional backslash. In the application of self-documenting code, however, it looks like another conceptual leap is in order.
That conceptual leap comes in the form of finding an alternate way to express these escaped characters. Obviously, the program is going to have to find some way to output a double quote; double quotes will, by necessity, appear in the source code, after all! But there can be no double quotes in the actual instructions, because if a double quote appears in the program proper, it will also have to appear in the description of the program, in which case it will have to be escaped. As we have seen, this is forbidden.
Fortunately, there are other ways to output this character. The easiest is the statement putchar(34); since 34 is the standard ASCII code for the double quote. The putchar function translates the code into the correct character and outputs the double quote to the screen. Similarly, putchar(10); will output a newline character.
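As a minimal sketch of this idea, the helper below wraps a string in double quotes and ends the line using only numeric character codes. The name put_quoted and its return convention are assumptions made for illustration, and the codes 34 and 10 assume an ASCII-compatible character set.

```c
#include <stdio.h>

/* Character codes written as plain integers, so that no quote or
   backslash has to appear in the statements themselves (ASCII values). */
enum { DOUBLE_QUOTE = 34, NEWLINE = 10 };

/* Hypothetical helper: prints s wrapped in double quotes and followed
   by a newline. Returns the number of characters written, or -1 if
   putchar reports an error. */
int put_quoted(const char *s)
{
    int count = 0;

    if (putchar(DOUBLE_QUOTE) == EOF) return -1;
    count++;
    for (; *s; s++) {
        if (putchar((unsigned char)*s) == EOF) return -1;
        count++;
    }
    if (putchar(DOUBLE_QUOTE) == EOF) return -1;
    count++;
    if (putchar(NEWLINE) == EOF) return -1;
    count++;
    return count;
}
```

Calling put_quoted with the five-letter word hello prints that word between two double quotes, followed by a newline, and returns 8. The helper itself contains no quote characters and no backslashes.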
This leaves one more step. We still need to output ';' on either side of the string. We could use a series of statements of the form:
and so on (the resulting code, Self 0, can be found in
Appendix A), but a slightly more elegant solution is to incorporate these extra
characters into the string used to describe the program. We can, at runtime,
selectively print sections of that string corresponding to different parts of
the program, by printing from an offset into the string and temporarily placing
an end-of-string marker at the end of the section we wish to print.
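The selective printing described above might be sketched as follows. The function name print_section and its bounds check are assumptions made for illustration, not the code of Self I. The sketch also requires a writable character array: modifying a string literal through a pointer is undefined behavior in standard C, even though some older compilers tolerated it.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: prints the len characters of s that begin at
   offset start, by temporarily overwriting the character just past the
   section with an end-of-string marker. Returns the number of
   characters printed, or -1 if the range does not fit inside s.
   s must be writable (an array, not a string literal). */
int print_section(char *s, size_t start, size_t len)
{
    size_t total = strlen(s);
    char saved;
    int written;

    if (start > total || len > total - start) return -1;

    saved = s[start + len];   /* remember the character we overwrite */
    s[start + len] = 0;       /* temporary end-of-string marker */
    written = printf("%s", s + start);
    s[start + len] = saved;   /* put the original character back */
    return written;
}
```

Because the overwritten character is restored before the function returns, the same string can be printed in several pieces, in any order, without being copied.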
Combining these techniques generates the first working self-documenting program:
When Self I is compiled and run, it outputs an exact duplicate of itself, and is therefore a successful self-documenting program. Unfortunately, Self I is cryptic and unreadable, even by C standards! The lines are long, and spacing is minimal. Furthermore, there is no practical way to reformat this program while preserving its self-reference. We cannot break the definition of string f over several lines, because as we have seen, newlines must be escaped within a string. If we inserted a newline directly into f, the program would fail to compile.
That is not to say there is no such thing as an elegant self-documenting program. It is in fact possible to write a much shorter program. The trick lies in careful use of the printf function and the fact that it allows the user to include format specifiers to reformat text before it gets printed out. Furthermore, the format string passed to printf is just a normal C string like any other. In fact, it is possible to achieve self-reference by letting the string which describes the program also serve as the format string for its own output! This idea gives us the following program:
Here, we have overcome the problem of outputting the characters which delimit
the string by making the string represent not just the use of the
program, but rather the program in its entirety. The string f represents
everything that needs to be printed out, with a few critical sections left out -
namely, the characters that need to be escaped, and the contents of the string.
That is the crucial idea behind Self II. The string f can avoid the infinite recursion of including a copy of its own contents within itself by specifying that some unknown string will later be substituted for the '%s', and then substituting f itself! This is why the printf statement contains f twice. The first time is as the format string, and the second is as the replacement text for the '%s' in f. And Self II has the added advantage of being relatively easy to read and understand.
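The double role of f can be illustrated with a small sketch. The description string below is a well-known construction of this kind and is not necessarily the text of Self II; the helper expand_description is hypothetical and writes into a buffer with snprintf so that the result can be inspected. As before, the codes 34 and 10 assume ASCII.

```c
#include <stdio.h>

/* A description of a tiny program, with %c wherever a double quote or
   a newline belongs and %s where the description itself belongs. */
static const char *f = "char*f=%c%s%c;main(){printf(f,34,f,34,10);}%c";

/* Hypothetical helper: expands f into out, passing f both as the
   format string and as the text substituted for its own %s.
   Returns whatever snprintf returns. */
int expand_description(char *out, size_t size)
{
    return snprintf(out, size, f, 34, f, 34, 10);
}
```

The expanded text begins with the declaration of f, continues with a double quote, the raw contents of f (format specifiers included), and a closing double quote, and ends with the instructions and a newline. The %s substitution copies f verbatim, so the specifiers inside the copy are not expanded a second time, which is what stops the recursion.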
There are, of course, many more possibilities. There are numerous ways to rephrase these working programs, and there are even programs that attack the problem from a different angle (for an example, see Self IV in Appendix A). Unfortunately, there is no room here to discuss the strategies used by those programs. At the very least, the examples given above provide a good starting point for exploring those other ideas.