MIT 6.044J/18.423J            Handout 25
6.044 Fall 1996               14 December, 1996
A. R. Meyer

Compiling Scheme to Nano-Computers

Nano-computers

Our objective in these notes will be to identify some very simple machine models which are nevertheless powerful enough to run Scheme. Since even real RISC micro-processors are vastly more complex than the simple machines we consider, we'll refer to our machines as nano-computers :-) .

Of course, even though in theory Scheme can be compiled into nano-computer code, we are not suggesting that this would be a sensible thing to do. The resulting compiled code is spectacularly inefficient, and there would be no point in trying to run it. Rather, what we learn from this reduction of Scheme to a nano-computer is that apparently very limited nano-computers are as resistant to uniform analysis as much richer Scheme expressions. For example, the compilation process transforms the Halting Problem for Scheme into the Halting Problem for nano-computers. Hence the Nano-Computer Halting Problem is "just as hard" as that for Scheme. In particular, the Nano-Computer Halting Problem is undecidable.

It is worth emphasizing that the definition of undecidability refers to Scheme procedures. Undecidability of the Nano-Computer Halting Problem means that SCHEME programs are incapable of deciding whether nano-computer programs halt. So even using a richly featured language, it is not possible to predict the behavior of extremely limited programs.

FOOTNOTE: (For certain kinds of "simple" machines, it is possible to show that the Halting Problem for Simple Machines cannot be decided BY "simple" machines. For example, we might choose simple machines to be the kind of "pushdown-store" machines used to parse inputs according to a BNF grammar. These machines are too limited to include a decider for their own halting problem. But undecidability BY simple machines need not imply that a problem is really hard.
For example, there is a rather efficient Scheme procedure which decides the Halting Problem for Pushdown-Store machines.) END-FOOTNOTE

A result like this provides some valuable guidance about where the "hard part" of program analysis resides. For example, an important part of real compiler technology involves program analyses which are far more detailed than just determining the halting property. How can compilers do this analysis, since we know such analysis is undecidable? The answer is that compilers DON'T fully analyze ALL possible source programs--they do the best they can. For some classes of programs, compilers can use procedures which are guaranteed to provide complete analyses; for programs which fall outside of the classes which the compiler can handle, there is no guarantee of the quality of analysis or of the resulting compiled code. It would be a waste of time for compiler designers to try developing perfect analysis methods for some unanalyzable subclass of programs. From the compilation of Scheme to nano-computers, we learn, for example, that simply cutting down on the built-in operations in a language cannot be expected to yield an analyzable class of programs. In this way, identifying simple unanalyzable/undecidable subclasses of programs gives useful guidance on where NOT to waste time.

The insight that even very simple programs have undecidable properties plays an additional important role in the Theory of Computation. It is a key technical fact used in establishing the pervasiveness of undecidability in subject areas far removed from Computer Science. So far we have seen only "incestuous" undecidable problems: problems about programs which cannot be decided by programs. Later we will discover undecidable problems about polynomials, simple algebras, and simple grammars, for example. The proofs will be based on learning to "program" with polynomials, algebraic equations, etc., so that they simulate more familiar kinds of programs.
But these problems from outside Computer Science were not intended to embody programming, and it is generally very difficult to program with them. For example, writing algebraic equations which, in a suitable sense, "simulate" Scheme programs would be an extremely cumbersome exercise. On the other hand, as should be expected, it is much easier to write equations which simulate nano-computer programs. But simulating nano-computers is sufficient, since nano-computers in turn can simulate Scheme.

One of the simplest nano-computers we consider is the n-Counter machine. This is a machine with n registers for some n > 0, each register capable of holding a nonnegative integer of arbitrary size. The only operations of the machine are to increment (add one to) any register, decrement any register (leaving it unchanged if the register contains zero), and branch on whether a register contains zero. The registers are called "counters" since in a single step they can only be incremented and decremented. For example, in 2-counter machines, the only instructions are:

instruction   informal meaning
-----------   ----------------
inc1          increment first register
inc2          increment second register
dec1          decrement first register, except leave it alone if it is 0.
dec2          decrement second register, except leave it alone if it is 0.
ifr1 n1 n2    if register 1 is zero, go to line n1, otherwise go to line n2.
ifr2 n1 n2    if register 2 is zero, go to line n1, otherwise go to line n2.

A 2-counter machine, M, is a finite sequence of such instructions, numbered consecutively starting at zero. For example, the machine M1 below adds the contents of its first counter to its second counter and then halts (by transferring to an instruction number larger than any in the program).

2-counter machine M1
--------------------
0: ifr1 4 1   ;line 4 means "halt"
1: dec1
2: inc2
3: ifr1 0 0   ;goto line 0

The "configuration" of a 2-CM is defined to be a triple (k, l, m) of nonnegative integers.
The intended interpretation of the triple is that k is the number of the next instruction to be executed, and l and m are the contents of counters 1 and 2. Formally, we associate with a 2-CM, M, a partial function step_M from configurations to configurations, where step_M(k, l, m) is the configuration, if any, after execution of M's k^th instruction when counter 1 contains l, and counter 2 contains m. For example, for the machine above

step_M1(0, 2, 0) = (1, 2, 0),
step_M1(1, 2, 0) = (2, 1, 0),
step_M1(2, 1, 0) = (3, 1, 1),
step_M1(3, 1, 1) = (0, 1, 1).

Exercise 1: Give a precise definition of the function step_M for an arbitrary 2-CM, M.

Now given the limited structure of counter machines, it is not even apparent what it would MEAN to compile Scheme onto one of them. Basic Scheme printable values such as symbols, lists, or even negative integers are not available in counter machines, so we cannot expect a counter machine program to produce the same printable values as a Scheme source program. The obvious thing to do would be to define some representation of Scheme values in terms of counter machine values, namely, nonnegative integers. We would then require that for any Scheme "source" expression, S, the compiled code--a counter machine we will call M_S--halts with the contents of some designated "output" counter equal to the nonnegative integer representation of the value of S.

However, for our purposes it is not necessary to develop such a representation of Scheme values. Namely, to conclude that the halting problem for various nano-computers is undecidable, it is enough to simulate halting behavior only. We say a 2-CM HALTS iff it halts when started in the "standard" initial configuration (0, 0, 0). Our main result is:

THEOREM (Scheme->2-CM). There is a procedure for translating any Scheme expression, S, into a 2-counter machine, M_S, such that S halts iff M_S halts.

From this we immediately conclude

COROLLARY: The 2-Counter Machine Halting Problem is undecidable.
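The 2-CM semantics above are easy to prototype. The sketch below is ours, not part of the handout: it represents programs as Python tuples and mirrors the informal instruction meanings, so it also suggests one possible shape for an answer to Exercise 1. Running M1 from (0, 2, 3) confirms that it adds counter 1 into counter 2.

```python
# A sketch (not from the handout) of 2-counter machine semantics.
# An instruction is a tuple: ("inc1",), ("dec2",), ("ifr1", n1, n2),
# etc.; a configuration is a triple (k, l, m).

def step(program, cfg):
    """step_M: the next configuration, or None once k runs off the
    end of the program (i.e., the machine has halted)."""
    k, l, m = cfg
    if k >= len(program):
        return None
    op = program[k]
    if op[0] == "inc1": return (k + 1, l + 1, m)
    if op[0] == "inc2": return (k + 1, l, m + 1)
    if op[0] == "dec1": return (k + 1, max(l - 1, 0), m)  # 0 stays 0
    if op[0] == "dec2": return (k + 1, l, max(m - 1, 0))
    if op[0] == "ifr1": return ((op[1] if l == 0 else op[2]), l, m)
    if op[0] == "ifr2": return ((op[1] if m == 0 else op[2]), l, m)
    raise ValueError("unknown instruction: %r" % (op,))

def run(program, cfg, max_steps=10_000):
    """Iterate step until the machine halts; return the final
    configuration reached."""
    for _ in range(max_steps):
        nxt = step(program, cfg)
        if nxt is None:
            return cfg
        cfg = nxt
    raise RuntimeError("step budget exhausted (machine may not halt)")

# M1 from the text: add counter 1 into counter 2, then halt.
M1 = [("ifr1", 4, 1),   # 0: if r1 = 0 goto 4 ("halt"), else goto 1
      ("dec1",),        # 1
      ("inc2",),        # 2
      ("ifr1", 0, 0)]   # 3: unconditional goto 0

# The sample step_M1 values from the text:
assert step(M1, (0, 2, 0)) == (1, 2, 0)
assert step(M1, (1, 2, 0)) == (2, 1, 0)
assert step(M1, (2, 1, 0)) == (3, 1, 1)
assert step(M1, (3, 1, 1)) == (0, 1, 1)

print(run(M1, (0, 2, 3)))   # halts in configuration (4, 0, 5)
```

Note that halting is represented exactly as in the text: the next-instruction number becomes larger than any line of the program.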
Going from Scheme expression, S, to 2-CM, M_S, is a long trip, and we need some "rest stops" along the way.

The beginning leg of the trip is the longest, ending at the first rest stop: Register Machines with a Heap (Heap-RegM's). Heap-RegM's are the machines which Abelson/Sussman use as a target for the Scheme compiler described in their book. These are machines with a fixed number of registers capable of holding symbols, integers, and CONS-CELLS whose CAR's and CDR's are register contents. In their initial development of Heap-RegM's, Abelson/Sussman assume Heap-RegM's have some moderately high-level operations (e.g., "LOOKUP-IN-ENVIRONMENT"). Subsequently they explain how to implement all such operations using basic operations on integers, booleans, and symbols, along with only the heap operations CONS, CAR, CDR, SET-CAR!, SET-CDR!.

The second rest stop is also described by Abelson/Sussman. They show how to represent and garbage-collect the heap of cons-cells and their contents with Random Access Machines with Indirect Memory References (RAM-w/ref's). These are machines with an array of registers numbered consecutively starting at zero. Each register may hold an arbitrary nonnegative integer. The operations of the RAM-w/ref consist of familiar arithmetic operations. Registers may be referenced directly, as in the assignment statement:

R4 := R4 + R5

Executing this instruction changes the contents of Register 4 to be the sum of its current contents and the contents of Register 5. Registers may also be referenced INDIRECTLY, as in

ref(R4) := R3 * ref(R6)

Executing this instruction sets the contents of Register m to be the product of the contents of Register 3 and the contents of Register n, where m is the contents of Register 4 and n is the contents of Register 6. Test instructions like

if LESS-THAN?(R4, ref(R1)) goto instruction 7 else goto 9

may also refer to registers directly or indirectly. A RAM-w/ref is defined to be a finite sequence of such instructions.
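The direct and indirect reference forms can be mimicked in a few lines of Python. The sketch below is ours, not the handout's; it uses a dictionary defaulting to zero so that every register "exists" and unwritten registers read as empty.

```python
from collections import defaultdict

# A sketch (not from the handout) of direct vs. indirect register
# references in a RAM-w/ref.  Unwritten registers contain zero.
R = defaultdict(int)

R[4], R[5] = 10, 7
R[4] = R[4] + R[5]            # direct:    R4 := R4 + R5
assert R[4] == 17

R[3], R[6], R[9] = 2, 9, 5
R[R[4]] = R[3] * R[R[6]]      # indirect:  ref(R4) := R3 * ref(R6)
# R4 holds 17 and R6 holds 9, so this sets R17 := R3 * R9 = 2 * 5.
assert R[17] == 10

# indirect test: if LESS-THAN?(R4, ref(R1)) goto 7 else goto 9
R[1] = 17                     # ref(R1) is the contents of R17, i.e., 10
next_instruction = 7 if R[4] < R[R[1]] else 9
assert next_instruction == 9  # 17 < 10 is false
```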
Formally, we can define a configuration of a RAM-w/ref to be a sequence (n_1, n_2, ..., n_k) of k >= 1 nonnegative integers, where n_1 specifies the number of the next instruction to be executed, n_2, n_3, ..., n_k are the contents of registers 0, 1, ..., k-2, and all registers numbered k-1 or more are empty, i.e., contain zero. For definiteness, we may assume that the integer operations of RAM-w/ref's are exactly +, *, - (restricted to return zero if a result would be negative), the test operations are exactly < and =, and all instructions are of one of the forms illustrated above.

EXERCISE 2. Let M be a RAM-w/ref. Give a precise definition of the function step_M mapping a configuration of M to the "next" configuration, if any, which follows after executing the designated next instruction of M.

So with Abelson/Sussman as our guides on these first legs of the trip, we arrive at

THEOREM (Scheme->RAM-w/ref). There is a procedure for translating any Scheme expression, S, into a RAM-w/ref, R_S, such that S halts iff R_S halts.

This translation is normally referred to as compilation. Of course, the compiled version, R_S, of S captures far more than just the halting behavior of S. But as explained above, for our purposes Theorem Scheme->RAM-w/ref is all we need to know about "real" compilation.

The next rest stop will be "ordinary" Register Machines (RegM's). These are essentially the same as RAM-w/ref's except that indirect register references are not allowed. This means that only a FIXED number of registers can be accessed by a given RegM. A RAM-w/ref, R, can be simulated with a RegM, G_R, by coding the entire current configuration of R into ONE number which G_R stores in some register. Let pair be some function which codes any pair of nonnegative integers into a single integer, e.g., pair(m, n) = 2^m * 3^n. Now code a configuration c = (n_1, ..., n_k) of R into the single number rep(c), defined to be pair(k, pair(n_1, pair(n_2, ..., pair(n_(k-1), n_k) ...))).
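The coding trick is easy to experiment with. The sketch below is ours, not the handout's; decoding works even on nested codes, because after all factors of 2 are divided out of a pair code, only a power of 3 remains. Note how quickly the codes grow under nesting, one concrete reason the compiled code is "spectacularly inefficient."

```python
# A sketch (not from the handout) of pair(m, n) = 2^m * 3^n and its
# inverse.  Decoding counts factors: dividing out all 2's leaves 3^n.
def pair(m, n):
    return 2**m * 3**n

def unpair(p):
    m = n = 0
    while p % 2 == 0:
        p //= 2; m += 1
    while p % 3 == 0:
        p //= 3; n += 1
    return (m, n)

def rep(c):
    """Right-nested pairing of a nonempty tuple of nonnegative
    integers into a single code number."""
    if len(c) == 1:
        return c[0]
    return pair(c[0], rep(c[1:]))

assert unpair(pair(5, 9)) == (5, 9)

# A tiny configuration; larger ones blow up doubly exponentially.
c = (1, 0, 2)
code = rep(c)                      # pair(1, pair(0, 2)) = 2 * 3^9
assert code == 39366
assert unpair(code) == (1, rep((0, 2)))   # peel off the first component
```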
Now it is not hard to see how to program a RegM to update rep(c) into rep(step_R(c)), as indicated in the next exercise. In fact, besides the register used to store the number representing R's configuration, G_R only needs some fixed number of registers, say 8, for various temporary results. In other words, we can be sure that G_R is a 9-Register Machine, no matter how many registers R may access.

EXERCISE 3. (a) Exhibit a RegM which, started in configuration (0, k, l), halts in configuration (n, m) where m = pair(k, l) and n = length(RegM). (b) Exhibit a RegM which, started with an arbitrary rep(c) in register 1, and an integer i in register 2, halts with registers 1 and 2 unchanged, and the integer n_i in register 3 if i <= k; if i > k, it should halt with zero in register 3.

So we have

THEOREM (RAM-w/ref->RegM). There is a procedure for translating any Random Access Machine with Indirect Memory References, R, into a 9-Register Machine, G_R, such that R halts iff G_R halts.

Next, it is easy to see how to write RegM routines which add, subtract, multiply, compare, or copy registers using only increment, decrement, and branch-on-zero instructions, which leads to our penultimate rest stop:

THEOREM (RegM->CM). There is a procedure for translating any n-Register machine, G, into an (n+9)-Counter machine, C_G, such that G halts iff C_G halts.

EXERCISE 4. (a) Exhibit a 9-counter machine which, started in configuration <0, k, l, 0, ..., 0>, halts in a configuration of the form <0, k, l, k+l, 0, ..., 0>. (b) Same as (a) with "+" replaced by "*".

Finally, a k-Counter machine, C, can be simulated by a 2-counter machine, M_C, which keeps in its first counter the code number representing the contents n_1, n_2, ..., n_k of C's registers. The code number will be

2^{n_1} * 3^{n_2} * ... * p_k^{n_k}

where p_k is the k^th prime number. To simulate an increment instruction on C's register 2, for example, M_C need only multiply the code number by 3.
This it can do by repeatedly decrementing its first register once while incrementing its second register three times until the first register is empty. Then it copies the second register back into the first by repeatedly decrementing the second and incrementing the first until the second is empty.

It can test C's register 2 for zero by dividing the code number by 3 and branching on the remainder. This it can do by repeatedly decrementing its first register three times while incrementing its second register once until the first register is empty. At this point the quotient of the code number divided by 3 is in register 2, and the number of decrements actually performed in the final, incomplete group of three is the remainder. Before branching on the remainder, M_C can restore the code number into register 1 by reversing the divide-by-3 process it just performed.

This sketch should be enough to understand how to complete the final leg of our trip:

THEOREM (n-CM->2-CM). There is a procedure for translating any n-Counter machine, C, into a 2-Counter machine, M_C, such that C halts iff M_C halts.

Combining all the legs of our trip completes the journey from Scheme to 2-CM's.
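The whole final leg can be prototyped at once. In this sketch (ours, not the handout's), counters are coded as prime powers, and times3/divmod3 use only the shuttle loops just described: each loop iteration stands for a short run of inc/dec instructions of M_C.

```python
# A sketch (not from the handout) of the final leg: k counters coded
# as 2^{n_1} * 3^{n_2} * ..., with arithmetic done by shuttle loops.
PRIMES = [2, 3, 5, 7, 11, 13]

def encode(counters):
    code = 1
    for p, n in zip(PRIMES, counters):
        code *= p**n
    return code

def times3(r1):
    """Multiply by 3: three inc2's per dec1, then copy r2 back."""
    r2 = 0
    while r1 > 0:
        r1 -= 1          # dec1
        r2 += 3          # inc2, inc2, inc2
    while r2 > 0:        # copy back: dec2 / inc1 until r2 is empty
        r2 -= 1
        r1 += 1
    return r1

def divmod3(r1):
    """Quotient and remainder by 3: three dec1's per inc2; the size
    of the final, incomplete group of three is the remainder."""
    r2 = 0
    while r1 > 0:
        done = 0
        while r1 > 0 and done < 3:
            r1 -= 1      # dec1
            done += 1
        if done == 3:
            r2 += 1      # inc2
        else:
            return r2, done
    return r2, 0

code = encode([2, 0, 1])              # counters (2, 0, 1): 2^2 * 5 = 20
code = times3(code)                   # simulate "increment counter 2"
assert code == encode([2, 1, 1])      # 2^2 * 3 * 5 = 60

q, r = divmod3(code)                  # test counter 2 for zero
assert (q, r) == (20, 0)              # remainder 0: counter 2 is nonzero
assert code == times3(q)              # restore the code number
```

Decrement is the same pattern with divide instead of multiply, guarded by the zero test, so this is all the machinery the n-CM->2-CM translation needs.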