Previous Next |
Register allocation for expression trees is much simpler than for arbitrary flow graphs. We do not need global dataflow analysis or interference graphs. Suppose we have a tiled tree such as in Image 9.2a. This tree has two trivial tiles, the TEMP nodes fp and i, which we assume are already in registers r_{fp} and r_{i} . We wish to label the roots of the nontrivial tiles (the ones corresponding to instructions, i.e., 2, 4, 5, 6, 8) with registers from the list r_{1}, r_{2},…, r_{k}. Algorithm 11.9 traverses the tree in postorder, assigning a register to the root of each tile. With n initialized to zero, this algorithm applied to the root (tile 9) produces the allocation {tile2 ↦ r_{1}, tile4 ↦ r_{2}, tile5 ↦ r_{2}, tile6 ↦ r_{1}, tile8 ↦ r_{2}, tile9 ↦ r_{1}}. The algorithm can be combined with Maximal Munch, since both algorithms are doing the same bottom-up traversal.
ALGORITHM 11.9: Simple register allocation on trees.function SimpleAlloc(t)
for each nontrivial tile u that is a child of t
SimpleAlloc(u)
for each nontrivial tile u that is a child of t
n ← n - 1
n ← n + 1
assign r_{n} to hold the value at the root of t
But this algorithm will not always lead to an optimal allocation. Consider the following tree, where each tile is shown as a single node:
The SimpleAlloc function will use three registers for this expression (as shown at left on the next page), but by reordering the instructions we can do the computation using only two registers (as shown at right):
r_{1} ← M[a] |
r_{1} ← M[b] |
r_{2} ← M[b] |
r_{2} ← M[c] |
r_{3} ← M[c] |
r_{1} ← r_{1} × r_{2} |
r_{2} ← r_{2} × r_{3} |
r_{2} ← M[a] |
r_{1} ← r_{1} + r_{2} |
r_{1} ← r_{2} + r_{1} |
Using dynamic programming, we can find the optimal ordering for the instructions. The idea is to label each tile with the number of registers it needs during its evaluation. Suppose a tile t has two nontrivial children u_{left} and u_{right} that require n and m registers, respectively, for their evaluation. If we evaluate u_{left} first, and hold its result in one register while we evaluate u_{right}, then we have needed max(n, 1 + m) registers for the whole expression rooted at t. Conversely, if we evaluate u_{right} first, then we need max(1 + n, m) registers. Clearly, if n > m, we should evaluate u_{left} first, and if n < m, we should evaluate u_{right} first. If n = m, we will need n + 1 registers no matter which subexpression is evaluated first. Algorithm 11.10 labels each tile t with need[t], the number of registers needed to evaluate the subtree rooted at t. It can be generalized to handle tiles with more than two children. Maximal Munch should identify - but not emit - the tiles, simultaneously with the labeling of Algorithm 11.10. The next pass emits Assem instructions for the tiles; wherever a tile has more than one child, the subtrees must be emitted in decreasing order of register need.
ALGORITHM 11.10: Sethi-Ullman labeling algorithm.function Label(t)
for each tile u that is a child of t
Label(u)
if t is trivial
then need[t] ← 0
else if t has two children, u_{left} and u_{right}
then if need[u^{left}] = need[u_{right}]
then need[t] ← 1 + need[u_{left}]
else need[t] ← max(1, need[u_{left}], need[u_{right}])
else if t has one child, u
then need[t] ← max(1, need[u]
else if t has no children
then need[t] ← 1
Algorithm 11.10 can profitably be used in a compiler that uses graph-coloring register allocation. Emitting the subtrees in decreasing order of need will minimize the number of simultaneously live temporaries and reduce the number of spills. In a compiler without graph-coloring register allocation, Algorithm 11.10 is used as a pre-pass to Algorithm 11.11, which assigns registers as the trees are emitted and also handles spilling cleanly. This takes care of register allocation for the internal nodes of expression trees; allocating registers for explicit TEMPsofthe Tree language would have to be done in some other way. In general, such a compiler would keep almost all program variables in the stack frame, so there would not be many of these explicit TEMPs to allocate.
ALGORITHM 11.11: Sethi-Ullman register allocation for trees.function SethiUllman(t, n)
if t has two children, u_{left} and u_{right}
if need[u_{left}] ≥ K ∧ need[u_{right}] ≥ K
SethiUllman(u_{right}, 0)
n ← n - 1
spill: emit instruction to store reg[u_{right}]
SethiUllman(u_{left}, 0)
unspill: reg[u_{right}] ← "r_{1}"; emit instruction to fetch reg[u_{right}]
else if need[u_{left}] ≥ need[u_{right}]
SethiUllman(u_{left}, n)
SethiUllman(u_{right}, n + 1)
else need[u_{left}] < need[u_{right}]
SethiUllman(u_{right}, n)
SethiUllman(u_{left}, n)
reg[t] ← "r_{n}"
emit OPER(instruction[t], reg[t], [ reg[u_{left}], reg[u_{right}]])
else if t has one child, u
SethiUllman(u, n)
reg[t] ← "r_{n}"
emit OPER(instruction[t], reg[t], [reg[u]])
else if t is nontrivial but has no children
reg[t] ← "r_{n}"
emit OPER(instruction[t], reg[t], [])
else if t is a trivial node TEMP(r_{i})
reg[t] ← "r_{i}"
JaVa | Previous Next |