INTERFACE TO THE COMPILER

The compiler for a garbage-collected language interacts with the garbage collector by generating code that allocates records, by describing locations of roots for each garbage-collection cycle, and by describing the layout of data records on the heap. For some versions of incremental collection, the compiler must also generate instructions to implement a read or write barrier.

FAST ALLOCATION

Some coding languages, and some programs, allocate heap data (and generate garbage) very rapidly. This is especially true of programs in functional languages, where updating old data is discouraged. The most allocation (and garbage) one could imagine a reasonable program generating is one word of allocation per store instruction; this is because each word of a heap-allocated record is usually initialized. Empirical measurements show that about one in every seven instructions executed is a store, almost regardless of coding language or program. Thus, we have (at most) Java ScreenShot word of allocation per instruction executed. Supposing that the cost of garbage collection can be made small by proper tuning of a generational collector, there may still be a considerable cost to create the heap records. To minimize this cost, copying collection should be used so that the allocation space is a contiguous free region; the next free location is next and the end of the region is limit. To allocate one record of size N, the steps are

  1. Call the allocate function.
  2. Test next + N < limit ? (If the test fails, call the garbage collector.)
  3. Move next into result
  4. Clear M[next], M[next + 1],…, M[next + N1]
  5. nextnext + N
  6. Return from the allocate function.
  1. Move result into some computationally useful place.
  2. Store useful values into the record.

Steps 1 and 6 should be eliminated by inline expanding the allocate function at each place where a record is allocated. Step 3 can often be eliminated by combining it with step A, and step 4 can be eliminated in favor of step B (steps A and B are not numbered because they are part of the useful computation; they are not allocation overhead). Steps 2 and 5 cannot be eliminated, but if there is more than one allocation in the same basic block (or in the same trace; see ), the comparison and increment can be shared among multiple allocations. By keeping next and limit in registers, steps 2 and 5 can be done in a total of three instructions.

By this combination of techniques, the cost of allocating a record - and then eventually garbage collecting it - can be brought down to about four instructions. This means that coding techniques such as the persistent binary search tree (page 108) can be efficient enough for everyday use.

DESCRIBING DATA LAYOUTS

The collector must be able to operate on records of all types: list, tree, or whatever the program has declared. It must be able to determine the number of fields in each record, and whether each field is a pointer. For statically typed languages such as Pascal, or for object-oriented languages such as Java or Modula-3, the simplest way to identify heap objects is to have the first word of every object point to a special type- or class-descriptor record. This record tells the total size of the object and the location of each pointer field. For statically typed languages this is an overhead of one word per record to serve the garbage collector. But object-oriented languages need this descriptor pointer in every object just to implement dynamic method lookup, so that there is no additional per-object overhead attributable to garbage collection. The type- or class-descriptor must be generated by the compiler from the static type information calculated by the semantic analysis phase of the compiler. The descriptor pointer will be the argument to the runtime system's alloc function. In addition to describing every heap record, the compiler must identify to the collector every pointer-containing temporary and local variable, whether it is in a register or in an activation record. Because the set of live temporaries can change at every instruction, the pointer map is different at every point in the program. Therefore, it is simpler to describe the pointer map only at points where a new garbage collection can begin. These are at calls to the alloc function; and also, since any function call might be calling a function which in turn calls alloc, the pointer map must be described at each function call. The pointer map is best keyed by return addresses: A function call at location a is best described by its return address immediately after a, because the return address is what the collector will see in the very next activation record. The data structure maps return addresses to live-pointer sets; for each pointer that is live immediately after the call, the pointer map tells its register or frame location. To find all the roots, the collector starts at the top of the stack and scans downward, frame by frame. Each return address keys the pointer-map entry that describes the next frame. In each frame, the collector marks (or forwards, if copying collection) from the pointers in that frame.

Callee-save registers need special handling. Suppose function f calls g, which calls h. Function h knows that it saved some of the callee-save registers in its frame and mentions this fact in its pointer map; but h does not know which of these registers are pointers. Therefore the pointer map for g must describe which of its callee-save registers contain pointers at the call to h and which are "inherited" from f.

DERIVED POINTERS

Sometimes a compiled program has a pointer that points into the middle of a heap record, or that points before or after the record. For example, the expression a[i-2000] can be calculated internally as M[a-2000+i]:Java ScreenShot

If the expression a[i-2000] occurs inside a loop, the compiler might choose to hoist t1a − 2000 outside the loop to avoid recalculating it in each iteration. If the loop also contains an alloc, and a garbage collection occurs while t1 is live, will the collector be confused by a pointer t1 that does not point to the beginning of an object, or (worse yet) that points to an unrelated object? We say that the t1 is derived from the base pointer a. The pointer map must identify each derived pointer and tell the base pointer from which it is derived. Then, when the collector relocates a to address a′, it must adjust t1 to point to address t1 + a′ − a. Of course, this means that a must remain live as long as t1 is live. Consider the loop at left, implemented as shown at right:

let r1 ← 100
 var a := intarray[100] of 0 r2 ← 0
 call alloc
 ar1
 in t1a - 2000
 for i := 1930 to 1990 i ← 1930
 do f(a[i-2000]) L1 : r1M[t1 + i]
 call f end L2 : if i ≤ 1990 goto L1

If there are no other uses of a, then the temporary a appears dead after the assignment to t1. But then the pointer map associated with the return address L2 would not be able to "explain" t1 adequately. Therefore, for purposes of the compiler's liveness analysis, a derived pointer implicitly keeps its base pointer live.