INTERMEDIATE REPRESENTATION FOR FLOW ANALYSIS

In this chapter we will consider intraprocedural global optimization. Intraprocedural means the analysis stays within a single procedure or function (of a language like MiniJava); global means that the analysis spans all the statements or basic blocks within that procedure. Interprocedural optimization is more global, operating on several procedures and functions at once. Each of the optimizing transformations listed at the beginning of the chapter can be applied using the following generic recipe:

There are many dataflow analyses that can provide useful information for optimizing transformations. Like the liveness analysis described in , most can be described by dataflow equations, a set of simultaneous equations derived from nodes in the flow graph.

QUADRUPLES

's liveness analysis operates on Assem instructions, which clearly indicate uses and defs but whose actual operations are machine-dependent assembly-language strings. Liveness analysis, and register allocation based on it, do not need to know what operations the instructions are performing, just their uses and definitions. But for the analyses and optimizations in this chapter, we need to understand the operations as well. Therefore, instead of Assem instructions we will use Tree-language terms (), simplified even further by ensuring that each Exp has only a single MEM or BINOP node. We can easily turn ordinary Tree expressions into simplified ones. Wherever there is a nested expression of one BINOP or MEM inside another, or a BINOP or MEM inside a JUMP or CJUMP, we introduce a new temporary using ESEQ:Java ScreenShot

and then apply the Canon module to remove all the ESEQ nodes. We also introduce new temporaries to ensure that any store statement (that is, a MOVE whose left-hand side is a MEM node) has only a TEMP or a CONST on its right-hand side, and only a TEMP or CONST under the MEM. The statements that remain are all quite simple; they take one of the forms shown in .

Java ScreenShot
Table 17.1: Quadruples expressed in the Tree language. Occurrences of a, b, c, f, L denote TEMP, CONST, or LABEL nodes only.

Java ScreenShot

Java ScreenShot

Because the "typical" statement is a ← bc with four components (a, b, c, ⊕), these simple statements are often called quadruples. We use ⊕ to stand for an arbitrary binop. A more efficient compiler would represent quadruples with their own data type (instead of using Tree data structures), and would translate from trees to quadruples all in one pass. Intraprocedural optimizations take these quadruples that come out of the Canon phase of the compiler, and transform them into a new set of quadruples. The optimizer may move, insert, delete, and modify the quadruples. The resulting procedure body must then be fed into the instruction-selection phase of the compiler. However, the tree matching will not be very effective on the "atomized" trees where each expression contains only one BINOP or MOVE. After the optimizations are completed, there will be many MOVE statements that define temporaries that are used only once. It will be necessary to find these and turn them back into nested expressions.

We make a control flow graph of the quadruples, with a directed edge from each node (statement) n to its successors - that is, the nodes that can execute immediately after n.