Previous    Next

## TAMING CONDITIONAL BRANCHES

Another aspect of the Tree language that has no direct equivalent in most machine instruction sets is the two-way branch of the CJUMP instruction. The Tree language CJUMP is designed with two target labels for convenience in translating into trees and analyzing trees. On a real machine, the conditional jump either transfers control (on a true condition) or "falls through" to the next instruction. To make the trees easy to translate into machine instructions, we will rearrange them so that every CJUMP(cond, lt, lf) is immediately followed by LABEL(lf), its "false branch." Each such CJUMP can be directly implemented on a real machine as a conditional branch to label lt. We will make this transformation in two stages: First, we take the list of canonical trees and form them into basic blocks; then we order the basic blocks into a trace. The next sections will define these terms.

### BASIC BLOCKS

In determining where the jumps go in a program, we are analyzing the program's control flow. Control flow is the sequencing of instructions in a program, ignoring the data values in registers and memory, and ignoring the arithmetic calculations. Of course, not knowing the data values means we cannot know whether the conditional jumps will go to their true or false labels; so we simply say that such jumps can go either way. In analyzing the control flow of a program, any instruction that is not a jump has an entirely uninteresting behavior. We can lump together any sequence of nonbranch instructions into a basic block and analyze the control flow between basic blocks. A basic block is a sequence of statements that is always entered at the beginning and exited at the end, that is:

• The first statement is a LABEL.

• The last statement is a JUMP or CJUMP.
• There are no other LABELs, JUMPs, or CJUMPs.

The algorithm for dividing a long sequence of statements into basic blocks is quite simple. The sequence is scanned from beginning to end; whenever a LABEL is found, a new block is started (and the previous block is ended); whenever a JUMP or CJUMP is found, a block is ended (and the next block is started). If this leaves any block not ending with a JUMP or CJUMP, then a JUMP to the next block's label is appended to the block. If any block has been left without a LABEL at the beginning, a new label is invented and stuck there. We will apply this algorithm to each function-body in turn. The procedure "epilogue" (which pops the stack and returns to the caller) will not be part of this body, but is intended to follow the last statement. When the flow of program execution reaches the end of the last block, the epilogue should follow. But it is inconvenient to have a "special" block that must come last and that has no JUMP at the end. Thus, we will invent a new label done - intended to mean the beginning of the epilogue - and put a JUMP(NAME done) at the end of the last block.

In the MiniJava compiler, the class Canon.BasicBlocks implements this simple algorithm.

### TRACES

ALGORITHM 8.2: Generation of traces.
```Put all the blocks of the program into a list Q.
while Q is not empty
Start a new (empty) trace, call it T.
Remove the head element b from Q.
while b is not marked
Mark b; append b to the end of the current trace T.
Examine the successors of b (the blocks to which b branches);
if there is any unmarked successor c
b ← c
End the current trace T.

```

### FINISHING UP

An efficient compiler will keep the statements grouped into basic blocks, because many kinds of analysis and optimization algorithms run faster on (relatively few) basic blocks than on (relatively many) individual statements. For the MiniJava compiler, however, we seek simplicity in the implementation of later phases. So we will flatten the ordered list of traces back into one long list of statements. At this point, most (but not all) CJUMPs will be followed by their true or false label. We perform some minor adjustments:

• Any CJUMP immediately followed by its false label we let alone (there will be many of these).

• For any CJUMP followed by its true label, we switch the true and false labels and negate the condition.
• For any CJUMP(cond, a, b, lt, lf) followed by neither label, we invent a new false label lf′ and rewrite the single CJUMP statement as three statements, just to achieve the condition that the CJUMP is followed by its false label:
```CJUMP(cond, a, b, lt, lf′)
LABEL lf′
JUMP(NAME lf)
```

The trace-generating algorithm will tend to order the blocks so that many of the unconditional JUMPs are immediately followed by their target labels. We can remove such jumps.

### OPTIMAL TRACES

For some apps of traces, it is important that any frequently executed sequence of instructions (such as the body of a loop) should occupy its own trace. This helps not only to minimize the number of unconditional jumps, but also may help with other kinds of optimizations, such as register allocation and instruction scheduling. Image 8.3 shows the same program organized into traces in different ways. Image 8.3a has a CJUMP and a JUMP in every iteration of the while-loop; Image 8.3b uses a different trace covering, also with CJUMP and a JUMP in every iteration. But Image 8.3c shows a better trace covering, with no JUMP in each iteration.

Image 8.3: Different trace coverings for the same program.

The MiniJava compiler's Canon module doesn't attempt to optimize traces around loops, but it is sufficient for the purpose of cleaning up the Tree-statement lists for generating assembly code.

 JaVa Previous    Next