TAMING CONDITIONAL BRANCHES

Another aspect of the Tree language that has no direct equivalent in most machine instruction sets is the two-way branch of the CJUMP instruction. The Tree language CJUMP is designed with two target labels for convenience in translating into trees and analyzing trees. On a real machine, the conditional jump either transfers control (on a true condition) or "falls through" to the next instruction. To make the trees easy to translate into machine instructions, we will rearrange them so that every CJUMP(cond, l_t, l_f) is immediately followed by LABEL(l_f), its "false branch." Each such CJUMP can be directly implemented on a real machine as a conditional branch to label l_t. We will make this transformation in two stages: First, we take the list of canonical trees and form them into basic blocks; then we order the basic blocks into a trace. The next sections will define these terms.

BASIC BLOCKS

In determining where the jumps go in a program, we are analyzing the program's control flow. Control flow is the sequencing of instructions in a program, ignoring the data values in registers and memory, and ignoring the arithmetic calculations. Of course, not knowing the data values means we cannot know whether the conditional jumps will go to their true or false labels; so we simply say that such jumps can go either way. In analyzing the control flow of a program, any instruction that is not a jump has an entirely uninteresting behavior. We can lump together any sequence of nonbranch instructions into a basic block and analyze the control flow between basic blocks. A basic block is a sequence of statements that is always entered at the beginning and exited at the end, that is:

The first statement is a LABEL.
The last statement is a JUMP or CJUMP.
There are no other LABELs, JUMPs, or CJUMPs.

The algorithm for dividing a long sequence of statements into basic blocks is quite simple. The sequence is scanned from beginning to end; whenever a LABEL is found, a new block is started (and the previous block is ended); whenever a JUMP or CJUMP is found, a block is ended (and the next block is started). If this leaves any block not ending with a JUMP or CJUMP, then a JUMP to the next block's label is appended to the block. If any block has been left without a LABEL at the beginning, a new label is invented and stuck there. We will apply this algorithm to each function-body in turn. The procedure "epilogue" (which pops the stack and returns to the caller) will not be part of this body, but is intended to follow the last statement. When the flow of program execution reaches the end of the last block, the epilogue should follow. But it is inconvenient to have a "special" block that must come last and that has no JUMP at the end. Thus, we will invent a new label done - intended to mean the beginning of the epilogue - and put a JUMP(NAME done) at the end of the last block.

In the MiniJava compiler, the class Canon.BasicBlocks implements this simple algorithm.

TRACES

Now the basic blocks can be arranged in any order, and the result of executing the program will be the same - every block ends with a jump to the appropriate place. We can take advantage of this to choose an ordering of the blocks satisfying the condition that each CJUMP is followed by its false label. At the same time, we can also arrange that many of the unconditional JUMPs are immediately followed by their target label. This will allow the deletion of these jumps, which will make the compiled program run a bit faster. A trace is a sequence of statements that could be consecutively executed during the execution of the program. It can include conditional branches. A program has many different, overlapping traces. For our purposes in arranging CJUMPs and false-labels, we want to make a set of traces that exactly covers the program: Each block must be in exactly one trace. To minimize the number of JUMPs from one trace to another, we would like to have as few traces as possible in our covering set. A very simple algorithm will suffice to find a covering set of traces. The idea is to start with some block - the beginning of a trace - and follow a possible execution path - the rest of the trace. Suppose block b₁ ends with a JUMP to b₄, and b₄ has a JUMP to b₆. Then we can make the trace b₁, b₄, b₆. But suppose b₆ ends with a conditional jump CJUMP(cond, b₇, b₃). We cannot know at compile time whether b₇ or b₃ will be next. But we can assume that some execution will follow b₃, so let us imagine it is that execution that we are simulating. Thus, we append b₃ to our trace and continue with the rest of the trace after b₃. The block b₇ will be in some other trace. Algorithm 8.2 (which is similar to Canon.TraceSchedule) ordersthe blocks into traces as follows: It starts with some block and follows a chain of jumps, marking each block and appending it to the current trace. Eventually it comes to a block whose successors are all marked, so it ends the trace and picks an unmarked block to start the next trace.ALGORITHM 8.2: Generation of traces.

Put all the blocks of the program into a list Q.
while Q is not empty
 Start a new (empty) trace, call it T.
 Remove the head element b from Q.
 while b is not marked
 Mark b; append b to the end of the current trace T.
 Examine the successors of b (the blocks to which b branches);
 if there is any unmarked successor c
 b ← c
 End the current trace T.

FINISHING UP

An efficient compiler will keep the statements grouped into basic blocks, because many kinds of analysis and optimization algorithms run faster on (relatively few) basic blocks than on (relatively many) individual statements. For the MiniJava compiler, however, we seek simplicity in the implementation of later phases. So we will flatten the ordered list of traces back into one long list of statements. At this point, most (but not all) CJUMPs will be followed by their true or false label. We perform some minor adjustments:

Any CJUMP immediately followed by its false label we let alone (there will be many of these).
For any CJUMP followed by its true label, we switch the true and false labels and negate the condition.
For any CJUMP(cond, a, b, l_t, l_f) followed by neither label, we invent a new false label l_f′ and rewrite the single CJUMP statement as three statements, just to achieve the condition that the CJUMP is followed by its false label:
```
CJUMP(cond, a, b, l_t, l_f′)
LABEL l_f′
JUMP(NAME l_f)
```

The trace-generating algorithm will tend to order the blocks so that many of the unconditional JUMPs are immediately followed by their target labels. We can remove such jumps.

OPTIMAL TRACES

For some apps of traces, it is important that any frequently executed sequence of instructions (such as the body of a loop) should occupy its own trace. This helps not only to minimize the number of unconditional jumps, but also may help with other kinds of optimizations, such as register allocation and instruction scheduling. Image 8.3 shows the same program organized into traces in different ways. Image 8.3a has a CJUMP and a JUMP in every iteration of the while-loop; Image 8.3b uses a different trace covering, also with CJUMP and a JUMP in every iteration. But Image 8.3c shows a better trace covering, with no JUMP in each iteration.
Image 8.3: Different trace coverings for the same program.

The MiniJava compiler's Canon module doesn't attempt to optimize traces around loops, but it is sufficient for the purpose of cleaning up the Tree-statement lists for generating assembly code.