MIT 6.044J/18.423J            Handout 25
6.044 Fall 1996               14 December, 1996
A. R. Meyer

Compiling Scheme to Nano-Computers

Nano-computers

Our objective in these notes will be to identify some very simple machine models which are nevertheless powerful enough to run Scheme. Since even real RISC micro-processors are vastly more complex than the simple machines we consider, we'll refer to our machines as nano-computers :-) .

Of course, even though in theory Scheme can be compiled into nano-computer code, we are not suggesting that this would be a sensible thing to do. The resulting compiled code is spectacularly inefficient, and there would be no point in trying to run it. Rather, what we learn from this reduction of Scheme to a nano-computer is that apparently very limited nano-computers are as resistant to uniform analysis as much richer Scheme expressions. For example, the compilation process transforms the Halting Problem for Scheme into the Halting Problem for nano-computers. Hence the Nano-Computer Halting Problem is "just as hard" as that for Scheme. In particular, the Nano-Computer Halting Problem is undecidable.

It is worth emphasizing that the definition of undecidability refers to Scheme procedures. Undecidability of the Nano-Computer Halting Problem means that SCHEME programs are incapable of deciding whether nano-computer programs halt. So even using a richly featured language, it is not possible to predict the behavior of extremely limited programs.

FOOTNOTE: (For certain kinds of "simple" machines, it is possible to show that the Halting Problem for Simple Machines cannot be decided BY "simple" machines. For example, we might choose simple machines to be the kind of "pushdown-store" machines used to parse inputs according to a BNF grammar. These machines are too limited to include a decider for their own halting problem. But undecidability BY simple machines need not imply that a problem is really hard.
For example, there is a rather efficient Scheme procedure which decides the Halting Problem for Pushdown-Store machines.) END-FOOTNOTE

A result like this provides some valuable guidance about where the "hard part" of program analysis resides. For example, an important part of real compiler technology involves program analyses which are far more detailed than just determining the halting property. How can compilers do this analysis, since we know such analysis is undecidable? The answer is that compilers DON'T fully analyze ALL possible source programs--they do the best they can. For some classes of programs, compilers can use procedures which are guaranteed to provide complete analyses; for programs which fall outside of the classes which the compiler can handle, there is no guarantee of the quality of analysis or of the resulting compiled code. It would be a waste of time for compiler designers to try developing perfect analysis methods for some unanalyzable subclass of programs. From the compilation of Scheme to nano-computers, we learn, for example, that simply cutting down on the built-in operations in a language cannot be expected to yield an analyzable class of programs. In this way, identifying simple unanalyzable/undecidable subclasses of programs gives useful guidance on where NOT to waste time.

The insight that even very simple programs have undecidable properties plays an additional important role in the Theory of Computation. It is a key technical fact used in establishing the pervasiveness of undecidability in subject areas far removed from Computer Science. So far we have seen only "incestuous" undecidable problems: problems about programs which cannot be decided by programs. Later we will discover undecidable problems about polynomials, simple algebras, and simple grammars, for example. The proofs will be based on learning to "program" with polynomials, algebraic equations, etc., so that they simulate more familiar kinds of programs.
But these problems from outside Computer Science were not intended to embody programming, and it is generally very difficult to program with them. For example, writing algebraic equations which, in a suitable sense, "simulate" Scheme programs would be an extremely cumbersome exercise. On the other hand, as should be expected, it is much easier to write equations which simulate nano-computer programs. But simulating nano-computers is sufficient, since nano-computers in turn can simulate Scheme.

One of the simplest nano-computers we consider is the n-Counter machine. This is a machine with n registers for some n > 0, each register capable of holding a nonnegative integer of arbitrary size. The only operations of the machine are to increment (add one to) any register, decrement any register (leaving it unchanged if the register contains zero), and branch on whether a register contains zero. The registers are called "counters" since in a single step they can only be incremented and decremented. For example, in 2-counter machines, the only instructions are:

instruction   informal meaning
-----------   ----------------
inc1          increment first register
inc2          increment second register
dec1          decrement first register, except leave it alone if it is 0.
dec2          decrement second register, except leave it alone if it is 0.
ifr1 n1 n2    if register 1 is zero, go to line n1, otherwise go to line n2.
ifr2 n1 n2    if register 2 is zero, go to line n1, otherwise go to line n2.

A 2-counter machine, M, is a finite sequence of such instructions, numbered consecutively starting at zero. For example, the machine M1 below adds the contents of its first counter to its second counter and then halts (by transferring to an instruction number larger than any in the program).

2-counter machine M1
--------------------
0: ifr1 4 1   ;line 4 means "halt"
1: dec1
2: inc2
3: ifr1 0 0   ;goto line 0

The "configuration" of a 2-CM is defined to be a triple (k, l, m) of nonnegative integers.
The intended interpretation of the triple is that k is the number of the next instruction to be executed, and l and m are the contents of counters 1 and 2. Formally, we associate with a 2-CM, M, a partial function step_M from configurations to configurations, where step_M(k, l, m) is the configuration, if any, after execution of M's k^th instruction when counter 1 contains l, and counter 2 contains m. For example, for the machine above

step_M1(0, 2, 0) = (1, 2, 0),
step_M1(1, 2, 0) = (2, 1, 0),
step_M1(2, 1, 0) = (3, 1, 1),
step_M1(3, 1, 1) = (0, 1, 1).

Exercise 1: Give a precise definition of the function step_M for an arbitrary 2-CM, M.

Now given the limited structure of counter machines, it is not even apparent what it would MEAN to compile Scheme onto one of them. Basic Scheme printable values such as symbols, lists, or even negative integers are not available in counter machines, so we cannot expect a counter machine program to produce the same printable values as a Scheme source program. The obvious thing to do would be to define some representation of Scheme values in terms of counter machine values, namely, nonnegative integers. We would then require that for any Scheme "source" expression, S, the compiled code--a counter machine we will call M_S--halts with the contents of some designated "output" counter equal to the nonnegative integer representation of the value of S.

However, for our purposes it is not necessary to develop such a representation of Scheme values. Namely, to conclude that the halting problem for various nano-computers is undecidable, it is enough to simulate halting behavior only. We say a 2-CM HALTS iff it halts when started in the "standard" initial configuration (0, 0, 0). Our main result is:

THEOREM (Scheme->2-CM). There is a procedure for translating any Scheme expression, S, into a 2-counter machine, M_S, such that S halts iff M_S halts.

From this we immediately conclude

COROLLARY: The 2-Counter Machine Halting Problem is undecidable.
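The 2-CM semantics above are easy to prototype. The sketch below is ours, not part of the handout: it represents programs as Python tuples and mirrors the informal instruction meanings, so it also suggests one possible shape for an answer to Exercise 1. Running M1 from (0, 2, 3) confirms that it adds counter 1 into counter 2.

```python
# A sketch (not from the handout) of 2-counter machine semantics.
# An instruction is a tuple: ("inc1",), ("dec2",), ("ifr1", n1, n2),
# etc.; a configuration is a triple (k, l, m).

def step(program, cfg):
    """step_M: the next configuration, or None once k runs off the
    end of the program (i.e., the machine has halted)."""
    k, l, m = cfg
    if k >= len(program):
        return None
    op = program[k]
    if op[0] == "inc1": return (k + 1, l + 1, m)
    if op[0] == "inc2": return (k + 1, l, m + 1)
    if op[0] == "dec1": return (k + 1, max(l - 1, 0), m)  # 0 stays 0
    if op[0] == "dec2": return (k + 1, l, max(m - 1, 0))
    if op[0] == "ifr1": return ((op[1] if l == 0 else op[2]), l, m)
    if op[0] == "ifr2": return ((op[1] if m == 0 else op[2]), l, m)
    raise ValueError("unknown instruction: %r" % (op,))

def run(program, cfg, max_steps=10_000):
    """Iterate step until the machine halts; return the final
    configuration reached."""
    for _ in range(max_steps):
        nxt = step(program, cfg)
        if nxt is None:
            return cfg
        cfg = nxt
    raise RuntimeError("step budget exhausted (machine may not halt)")

# M1 from the text: add counter 1 into counter 2, then halt.
M1 = [("ifr1", 4, 1),   # 0: if r1 = 0 goto 4 ("halt"), else goto 1
      ("dec1",),        # 1
      ("inc2",),        # 2
      ("ifr1", 0, 0)]   # 3: unconditional goto 0

# The sample step_M1 values from the text:
assert step(M1, (0, 2, 0)) == (1, 2, 0)
assert step(M1, (1, 2, 0)) == (2, 1, 0)
assert step(M1, (2, 1, 0)) == (3, 1, 1)
assert step(M1, (3, 1, 1)) == (0, 1, 1)

print(run(M1, (0, 2, 3)))   # halts in configuration (4, 0, 5)
```

Note that halting is represented exactly as in the text: the next-instruction number becomes larger than any line of the program.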
Going from Scheme expression, S, to 2-CM, M_S, is a long trip, and we need some "rest stops" along the way.

The beginning leg of the trip is the longest, ending at the first rest stop: Register Machines with a Heap (Heap-RegM's). Heap-RegM's are the machines which Abelson/Sussman use as a target for the Scheme compiler described in their book. These are machines with a fixed number of registers capable of holding symbols, integers, and CONS-CELLS whose CAR's and CDR's are register contents. In their initial development of Heap-RegM's, Abelson/Sussman assume Heap-RegM's have some moderately high-level operations (e.g., "LOOKUP-IN-ENVIRONMENT"). Subsequently they explain how to implement all such operations using basic operations on integers, booleans, and symbols, along with only the heap operations CONS, CAR, CDR, SET-CAR!, SET-CDR!.

The second rest stop is also described by Abelson/Sussman. They show how to represent and garbage-collect the heap of cons-cells and their contents with Random Access Machines with Indirect Memory References (RAM-w/ref's). These are machines with an array of registers numbered consecutively starting at zero. Each register may hold an arbitrary nonnegative integer. The operations of the RAM-w/ref consist of familiar arithmetic operations. Registers may be referenced directly, as in the assignment statement:

R4 := R4 + R5

Executing this instruction changes the contents of Register 4 to be the sum of its current contents and the contents of Register 5. Registers may also be referenced INDIRECTLY, as in

ref(R4) := R3 * ref(R6)

Executing this instruction sets the contents of Register m to be the product of the contents of Register 3 and the contents of Register n, where m is the contents of Register 4 and n is the contents of Register 6. Test instructions like

if LESS-THAN?(R4, ref(R1)) goto instruction 7 else goto 9

may also refer to registers directly or indirectly. A RAM-w/ref is defined to be a finite sequence of such instructions.
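The direct and indirect reference forms can be mimicked in a few lines of Python. The sketch below is ours, not the handout's; it uses a dictionary defaulting to zero so that every register "exists" and unwritten registers read as empty.

```python
from collections import defaultdict

# A sketch (not from the handout) of direct vs. indirect register
# references in a RAM-w/ref.  Unwritten registers contain zero.
R = defaultdict(int)

R[4], R[5] = 10, 7
R[4] = R[4] + R[5]            # direct:    R4 := R4 + R5
assert R[4] == 17

R[3], R[6], R[9] = 2, 9, 5
R[R[4]] = R[3] * R[R[6]]      # indirect:  ref(R4) := R3 * ref(R6)
# R4 holds 17 and R6 holds 9, so this sets R17 := R3 * R9 = 2 * 5.
assert R[17] == 10

# indirect test: if LESS-THAN?(R4, ref(R1)) goto 7 else goto 9
R[1] = 17                     # ref(R1) is the contents of R17, i.e., 10
next_instruction = 7 if R[4] < R[R[1]] else 9
assert next_instruction == 9  # 17 < 10 is false
```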
Formally, we can define a configuration of a RAM-w/ref to be a sequence (n_1, n_2, ..., n_k) of k >= 1 nonnegative integers, where n_1 specifies the number of the next instruction to be executed, n_2, n_3, ..., n_k are the contents of registers 0, 1, ..., k-2, and all registers numbered k-1 or more are empty, i.e., contain zero. For definiteness, we may assume that the integer operations of RAM-w/ref's are exactly +, *, - (restricted to return zero if a result would be negative), the test operations are exactly < and =, and all instructions are of one of the forms illustrated above.

EXERCISE 2. Let M be a RAM-w/ref. Give a precise definition of the function step_M mapping a configuration of M to the "next" configuration, if any, which follows after executing the designated next instruction of M.

So with Abelson/Sussman as our guides on these first legs of the trip, we arrive at

THEOREM (Scheme->RAM-w/ref). There is a procedure for translating any Scheme expression, S, into a RAM-w/ref, R_S, such that S halts iff R_S halts.

This translation is normally referred to as compilation. Of course, the compiled version, R_S, of S captures far more than just the halting behavior of S. But as explained above, for our purposes Theorem Scheme->RAM-w/ref is all we need to know about "real" compilation.

The next rest stop will be "ordinary" Register Machines (RegM's). These are essentially the same as RAM-w/ref's except that indirect register references are not allowed. This means that only a FIXED number of registers can be accessed by a given RegM. A RAM-w/ref, R, can be simulated with a RegM, G_R, by coding the entire current configuration of R into ONE number which G_R stores in some register. Let pair be some function which codes any pair of nonnegative integers into a single integer, e.g., pair(m, n) = 2^m * 3^n. Now code a configuration c = (n_1, ..., n_k) of R into the single number rep(c), defined to be pair(k, pair(n_1, pair(n_2, ..., pair(n_(k-1), n_k) ...))).
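The coding trick is easy to experiment with. The sketch below is ours, not the handout's; decoding works even on nested codes, because after all factors of 2 are divided out of a pair code, only a power of 3 remains. Note how quickly the codes grow under nesting, one concrete reason the compiled code is "spectacularly inefficient."

```python
# A sketch (not from the handout) of pair(m, n) = 2^m * 3^n and its
# inverse.  Decoding counts factors: dividing out all 2's leaves 3^n.
def pair(m, n):
    return 2**m * 3**n

def unpair(p):
    m = n = 0
    while p % 2 == 0:
        p //= 2; m += 1
    while p % 3 == 0:
        p //= 3; n += 1
    return (m, n)

def rep(c):
    """Right-nested pairing of a nonempty tuple of nonnegative
    integers into a single code number."""
    if len(c) == 1:
        return c[0]
    return pair(c[0], rep(c[1:]))

assert unpair(pair(5, 9)) == (5, 9)

# A tiny configuration; larger ones blow up doubly exponentially.
c = (1, 0, 2)
code = rep(c)                      # pair(1, pair(0, 2)) = 2 * 3^9
assert code == 39366
assert unpair(code) == (1, rep((0, 2)))   # peel off the first component
```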
Now it is not hard to see how to program a RegM to update rep(c) into rep(step_R(c)), as indicated in the next exercise. In fact, besides the register used to store the number representing R's configuration, G_R only needs some fixed number of registers, say 8, for various temporary results. In other words, we can be sure that G_R is a 9-Register Machine, no matter how many registers R may access.

EXERCISE 3. (a) Exhibit a RegM which, started in configuration (0, k, l), halts in configuration (n, m) where m = pair(k, l) and n = length(RegM). (b) Exhibit a RegM which, started with an arbitrary rep(c) in register 1, and an integer i in register 2, halts with registers 1 and 2 unchanged, and the integer n_i in register 3 if i <= k; if i > k, it should halt with zero in register 3.

So we have

THEOREM (RAM-w/ref->RegM). There is a procedure for translating any Random Access Machine with Indirect Memory References, R, into a 9-Register Machine, G_R, such that R halts iff G_R halts.

Next, it is easy to see how to write RegM routines which add, subtract, multiply, compare, or copy registers using only increment, decrement, and branch-on-zero instructions, which leads to our penultimate rest stop:

THEOREM (RegM->CM). There is a procedure for translating any n-Register machine, G, into an (n+9)-Counter machine, C_G, such that G halts iff C_G halts.

EXERCISE 4. (a) Exhibit a 9-counter machine which, started in configuration <0, k, l, 0, ..., 0>, halts in a configuration of the form <0, k, l, k+l, 0, ..., 0>. (b) Same as (a) with "+" replaced by "*".

Finally, a k-Counter machine, C, can be simulated by a 2-counter machine, M_C, which keeps in its first counter the code number representing the contents n_1, n_2, ..., n_k of C's registers. The code number will be

2^{n_1} * 3^{n_2} * ... * p_k^{n_k}

where p_k is the k^th prime number. To simulate an increment instruction on C's register 2, for example, M_C need only multiply the code number by 3.
This it can do by repeatedly decrementing its first register once while incrementing its second register three times until the first register is empty. Then it copies the second register back into the first by repeatedly decrementing the second and incrementing the first until the second is empty.

It can test C's register 2 for zero by dividing the code number by 3 and branching on the remainder. This it can do by repeatedly decrementing its first register three times while incrementing its second register once until the first register is empty. At this point the quotient of the code number divided by 3 is in register 2, and the number of decrements actually performed in the final, incomplete group of three is the remainder. Before branching on the remainder, M_C can restore the code number into register 1 by reversing the divide-by-3 process it just performed.

This sketch should be enough to understand how to complete the final leg of our trip:

THEOREM (n-CM->2-CM). There is a procedure for translating any n-Counter machine, C, into a 2-Counter machine, M_C, such that C halts iff M_C halts.

Combining all the legs of our trip completes the journey from Scheme to 2-CM's.
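The whole final leg can be prototyped at once. In this sketch (ours, not the handout's), counters are coded as prime powers, and times3/divmod3 use only the shuttle loops just described: each loop iteration stands for a short run of inc/dec instructions of M_C.

```python
# A sketch (not from the handout) of the final leg: k counters coded
# as 2^{n_1} * 3^{n_2} * ..., with arithmetic done by shuttle loops.
PRIMES = [2, 3, 5, 7, 11, 13]

def encode(counters):
    code = 1
    for p, n in zip(PRIMES, counters):
        code *= p**n
    return code

def times3(r1):
    """Multiply by 3: three inc2's per dec1, then copy r2 back."""
    r2 = 0
    while r1 > 0:
        r1 -= 1          # dec1
        r2 += 3          # inc2, inc2, inc2
    while r2 > 0:        # copy back: dec2 / inc1 until r2 is empty
        r2 -= 1
        r1 += 1
    return r1

def divmod3(r1):
    """Quotient and remainder by 3: three dec1's per inc2; the size
    of the final, incomplete group of three is the remainder."""
    r2 = 0
    while r1 > 0:
        done = 0
        while r1 > 0 and done < 3:
            r1 -= 1      # dec1
            done += 1
        if done == 3:
            r2 += 1      # inc2
        else:
            return r2, done
    return r2, 0

code = encode([2, 0, 1])              # counters (2, 0, 1): 2^2 * 5 = 20
code = times3(code)                   # simulate "increment counter 2"
assert code == encode([2, 1, 1])      # 2^2 * 3 * 5 = 60

q, r = divmod3(code)                  # test counter 2 for zero
assert (q, r) == (20, 0)              # remainder 0: counter 2 is nonzero
assert code == times3(q)              # restore the code number
```

Decrement is the same pattern with divide instead of multiply, guarded by the zero test, so this is all the machinery the n-CM->2-CM translation needs.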